Abstract

Galaxies are not uniformly distributed in space. On large scales the Universe
displays coherent structure, with galaxies residing in groups and clusters on
scales of ~1-3 $h^{-1}$ Mpc, which lie at the intersections of long filaments of
galaxies that are >10 $h^{-1}$ Mpc in length. Vast regions of relatively empty space,
known as voids, contain very few galaxies and span the volume in between these 
structures. This observed large scale structure depends both on cosmological 
parameters and on the formation and evolution of galaxies. Using the two-point 
correlation function, one can trace the dependence of large scale structure on 
galaxy properties such as luminosity, color, stellar mass, and track its evolution 
with redshift. Comparison of the observed galaxy clustering signatures with 
dark matter simulations allows one to model and understand the clustering of 
galaxies and their formation and evolution within their parent dark matter ha- 
los. Clustering measurements can determine the parent dark matter halo mass 
of a given galaxy population, connect observed galaxy populations at different 
epochs, and constrain cosmological parameters and galaxy evolution models. 
This chapter describes the methods used to measure the two-point correlation 
function in both redshift and real space, presents the current results of how 
the clustering amplitude depends on various galaxy properties, and discusses 
quantitative measurements of the structures of voids and filaments. The inter- 
pretation of these results with current theoretical models is also presented. 

Index terms: clustering, angular clustering, two-point correlation function, 
higher-order correlation function, void, void probability function, filament, galaxy 
evolution, redshift space distortions, dark matter halo, halo model, bias, dark 
matter, halo occupation

1 Historical Background



Large scale structure is defined as the structure or inhomogeneity of the Uni- 
verse on scales larger than that of a galaxy. The idea of whether galaxies are 
distributed uniformly in space can be traced to Edwin Hubble, who used his 
catalog of 400 "extragalactic nebulae" to test the homogeneity of the Universe 
(Hubble, 1926), finding it to be generally uniform on large scales. In 1932, the
larger Shapley-Ames catalog of bright galaxies was published (Shapley & Ames,
1932), in which the authors note "the general unevenness in distribution" of
the galaxies projected onto the plane of the sky and the roughly factor of two
difference in the numbers of galaxies in the northern and southern galactic
hemispheres. Using this larger statistical sample, Hubble (1934) noted that on
angular scales less than ~10° there is an excess in the number counts of galaxies
above what would be expected for a random Poisson distribution, though
the sample follows a Gaussian distribution on larger scales. Hence, while the 
Universe appears to be homogeneous on the largest scales, on smaller scales it 
is clearly clumpy. 

Measurements of large scale structure took a major leap forward with the
Lick galaxy catalog produced by Shane & Wirtanen (1963), which contained
information on roughly a million galaxies obtained using photographic plates at
the 0.5m refractor at Lick Observatory. Seldner et al. (1977) published maps of
the counts of galaxies in angular cells across the sky (see Fig. 1), which showed 
in much greater detail that the projected distribution of galaxies on the plane 
of the sky is not uniform. The maps display a rich structure with a foam-like 
pattern, containing possible walls or filaments with long strands of galaxies, 
clusters, and large empty regions. The statistical spatial distribution of galaxies
from this catalog and that of Zwicky et al. (1968) was analyzed by Jim Peebles
and collaborators in a series of papers (e.g., Peebles, 1975) that showed that
the angular two-point correlation function (defined below) roughly follows a
power law distribution over angular scales of ~0.1°-5°. In these papers it was
discovered that the clustering amplitude is lower for fainter galaxy populations, 
which likely arises from larger projection effects along the line of sight. As faint 
galaxies typically lie at larger distances, the projected clustering integrates over 
a wider volume of space and therefore dilutes the effect. 

These results in part spurred the first large scale redshift surveys, which ob- 
tained optical spectra of individual galaxies in order to measure the redshifts and 
spatial distributions of large galaxy samples. Two of the first such redshift
surveys were the KOS survey (Kirshner, Oemler, Schechter; Kirshner et al.,
1978) and the original CfA survey (Center for Astrophysics; Davis et al., 1982).
The KOS survey measured redshifts for 164 galaxies brighter than magnitude 15
in eight separate fields on the sky, covering a total of 15 deg$^2$. Part of the
motivation for the survey was to study the three dimensional spatial distribution of
galaxies, about which the authors note that "although not entirely unexpected, 
it is striking how strongly clustered our galaxies are in velocity space," as seen 
in strongly peaked one dimensional redshift histograms in each field. 

Figure 1: Angular distribution of counts of galaxies brighter than B ~ 19 on
the plane of the sky, reconstructed from the Lick galaxy catalog (from Seldner
et al. 1977). This image shows the number of galaxies observed in 10' x 10'
cells across the northern galactic hemisphere, where brighter cells contain more
galaxies. The northern galactic pole is at the center, with the galactic equator at
the edge. The distribution of galaxies is clearly not uniform; clumps of galaxies
are seen in white, with very few galaxies observed in the dark regions between.

Figure 2: Distribution of galaxies in redshift space from the original CfA galaxy
redshift survey (from Davis et al. 1982). Plotted are 249 galaxies as a function
of observed velocity (corresponding to a given redshift) versus right ascension
for a wedge in declination of 10° < $\delta$ < 20°.

The original CfA survey, completed in 1982, contained redshifts for 2,400
galaxies brighter than magnitude 14.5 across the north and south galactic poles,
covering a total of 2.7 steradians. The major aims of the survey were cosmo- 
logical and included quantifying the clustering of galaxies in three dimensions. 
This survey produced the first large area and moderately deep three dimensional 
maps of large scale structure (see Fig. 2), in which one could identify galaxy 
clusters, large holes or voids, and an apparent "filamentary connected structure" 
between groups of galaxies, which the authors caution could be random projections
of distinct structures (Davis et al., 1982). This paper also performed a
comparison of the so-called "complex topology" of the large scale structure seen 
in the galaxy distribution with that seen in N-body dark matter simulations, 
paving the way for future studies of theoretical models of structure formation. 

The second CfA redshift survey, which ran from 1985 to 1995, contained 
spectra for ~5,800 galaxies and revealed the existence of the so-called "Great 
Wall", a supercluster of galaxies that extends over 170 $h^{-1}$ Mpc, the width of the
survey (Geller & Huchra, 1989). Large underdense voids were also commonly
found, with a density 20% of the mean density. 

Redshift surveys have rapidly progressed with the development of multi- 
object spectrographs, which allow simultaneous observations of hundreds of 
galaxies, and larger telescopes, which allow deeper surveys of both lower lu- 
minosity nearby galaxies and more distant, luminous galaxies. At present the 
largest redshift surveys of galaxies at low redshift are the Two Degree Field
Galaxy Redshift Survey (2dFGRS, Colless et al., 2001) and the Sloan Digital
Sky Survey (SDSS, York et al., 2000), which cover volumes of ~$4 \times 10^7$
$h^{-3}$ Mpc$^3$ and ~$2 \times 10^8$ $h^{-3}$ Mpc$^3$ with spectroscopic redshifts for ~220,000 and
a million galaxies, respectively. These surveys provide the best current maps of
large scale structure in the Universe today (see Fig. 3), revealing a sponge-like
pattern to the distribution of galaxies (Gott et al., 1986). Voids of ~10 $h^{-1}$
Mpc are clearly seen, containing very few galaxies. Filaments stretching greater
than 10 $h^{-1}$ Mpc surround the voids and intersect at the locations of galaxy
groups and clusters.

Figure 3: The spatial distribution of galaxies as a function of redshift and right
ascension (projected through 3° in declination) from the 2dF Galaxy Redshift
Survey (from Colless et al. 2004).

The prevailing theoretical paradigm regarding the existence of large scale 
structure is that the initial fluctuations in the energy density of the early Uni- 
verse, seen as temperature deviations in the cosmic microwave background, grow 
through gravitational instability into the structure seen today in the galaxy 
density field. The details of large scale structure - the sizes, densities, and dis- 
tribution of the observed structure - depend both on cosmological parameters 
such as the matter density and dark energy as well as on the physics of galaxy 
formation and evolution. Measurements of large scale structure can therefore 
constrain both cosmology and galaxy evolution physics. 



2 The Two-Point Correlation Function 

In order to quantify the clustering of galaxies, one must survey not only galaxies 
in clusters but rather the entire galaxy density distribution, from voids to
superclusters. The most commonly used quantitative measure of large scale structure
is the galaxy two-point correlation function, $\xi(r)$, which traces the amplitude of
galaxy clustering as a function of scale. $\xi(r)$ is defined as a measure of the
excess probability $dP$, above what is expected for an unclustered random Poisson
distribution, of finding a galaxy in a volume element $dV$ at a separation $r$ from
another galaxy,

$$dP = n\,[1 + \xi(r)]\,dV, \tag{1}$$

where $n$ is the mean number density of the galaxy sample in question (Peebles,
1980). Measurements of $\xi(r)$ are generally performed in comoving space, with
$r$ having units of $h^{-1}$ Mpc. The Fourier transform of the two-point correlation
function is the power spectrum, which is often used to describe density
fluctuations observed in the cosmic microwave background. 

To measure $\xi(r)$, one counts pairs of galaxies as a function of separation
and divides by what is expected for an unclustered distribution. To do this
one must construct a "random catalog" that has the identical three dimensional
coverage as the data - including the same sky coverage and smoothed redshift
distribution - but is populated with randomly distributed points. The ratio of
pairs of galaxies observed in the data relative to pairs of points in the random
catalog is then used to estimate $\xi(r)$. Several different estimators for $\xi(r)$ have
been proposed and tested. An early estimator that was widely used is from
Davis & Peebles (1983):

$$\xi = \frac{n_R}{n_D}\,\frac{DD}{DR} - 1, \tag{2}$$

where $DD$ and $DR$ are counts of pairs of galaxies (in bins of separation) in the
data catalog and between the data and random catalogs, and $n_D$ and $n_R$ are the
mean number densities of galaxies in the data and random catalogs. Hamilton
(1993) later introduced an estimator with smaller statistical errors,

$$\xi = \frac{DD\,RR}{(DR)^2} - 1, \tag{3}$$

where $RR$ is the count of pairs of points as a function of separation in the
random catalog. The most commonly-used estimator is from Landy & Szalay
(1993),

$$\xi = \frac{1}{RR}\left[DD\left(\frac{n_R}{n_D}\right)^2 - 2\,DR\left(\frac{n_R}{n_D}\right) + RR\right]. \tag{4}$$



This estimator has been shown to perform as well as the Hamilton estimator 
(Eqn. 3), and while it requires more computational time it is less sensitive to the 
size of the random catalog and handles edge corrections well, which can affect
clustering measurements on large scales (Kerscher et al., 2000).



As can be seen from the form of the estimators given above, measuring $\xi(r)$
depends sensitively on having a random catalog which accurately reflects the
various spatial and redshift selection effects in the data. These can include
effects such as edges of slitmasks or fiber plates, overlapping slitmasks or plates,
gaps between chips on the CCD, and changes in spatial sensitivity within the
detector (e.g., the effective radial dependence within X-ray detectors). If one is
measuring a full three-dimensional correlation function (discussed below) then 
the random catalog must also accurately include the redshift selection of the 
data. The random catalog should also be large enough to not introduce Poisson 
error in the estimator. This can be checked by ensuring that the RR pair counts 
in the smallest bin are high enough such that Poisson errors are subdominant. 
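
The pair-counting procedure described above can be sketched in a few lines of
Python. This is a minimal illustration under stated assumptions, not the
pipeline of any particular survey: the function names `pair_counts` and
`landy_szalay` are hypothetical, positions are assumed to be comoving Cartesian
coordinates, the counts are brute force, and each count is normalized by its
total number of pairs, a common stand-in for the $n_R/n_D$ factors when the
data and random catalogs fill the same volume. The raw RR counts are returned
so that the smallest bin can be checked for Poisson noise.

```python
import numpy as np

def pair_counts(a, b, edges, auto=False):
    """Histogram of 3D pair separations between point sets a and b (N x 3)."""
    # Brute-force distance matrix: adequate for a sketch, far too slow and
    # memory-hungry for survey-sized catalogs.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    if auto:
        # For auto-counts keep each unordered pair once and drop self-pairs.
        d = d[np.triu_indices(len(a), k=1)]
    counts, _ = np.histogram(d.ravel(), bins=edges)
    return counts.astype(float)

def landy_szalay(data, rand, edges):
    """Return (xi, raw RR counts) per separation bin."""
    nd, nr = len(data), len(rand)
    rr_raw = pair_counts(rand, rand, edges, auto=True)
    # Normalize each count by the total number of pairs of that type
    # (assumption: data and randoms cover the same volume).
    dd = pair_counts(data, data, edges, auto=True) / (nd * (nd - 1) / 2)
    dr = pair_counts(data, rand, edges) / (nd * nr)
    rr = rr_raw / (nr * (nr - 1) / 2)
    return (dd - 2 * dr + rr) / rr, rr_raw
```

For an unclustered data set drawn from the same volume as the randoms, the
returned estimate should scatter around zero in every bin.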

3 Angular Clustering 

The spatial distribution of galaxies can be measured either in two dimensions as 
projected onto the plane of the sky or in three dimensions using the redshift of 
each galaxy. As it can be observationally expensive to obtain spectra for large 
samples of (particularly faint) galaxies, redshift information is not always avail- 
able for a given sample. One can then measure the two-dimensional projected 
angular correlation function $\omega(\theta)$, defined as the probability above Poisson of
finding two galaxies with an angular separation $\theta$:

$$dP = N\,[1 + \omega(\theta)]\,d\Omega, \tag{5}$$

where $N$ is the mean number of galaxies per steradian and $d\Omega$ is the solid angle
of a second galaxy at a separation $\theta$ from a randomly chosen galaxy.

Measurements of $\omega(\theta)$ are known to be low by an additive factor known
as the "integral constraint", which results from using the data sample itself 
(which often does not cover large areas of the sky) to estimate the mean galaxy 
density. This correction becomes important on angular scales comparable to 
the survey size; on much smaller scales it is negligible. One can either restrict 
measurements of the angular clustering to scales where the integral constraint 
is not important or estimate the amplitude of the integral constraint correction 
by doubly integrating an assumed power law form of $\omega(\theta)$ over the survey area,
$\Omega$, using

$$A_C = \frac{1}{\Omega^2} \int\!\!\int \omega(\theta)\, d\Omega_1\, d\Omega_2, \tag{6}$$

where $\Omega$ is the area subtended by the survey. In practice, this can be numerically
estimated over the survey geometry using the random catalog itself (see Roche
& Eales 1999 for details).
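
One common way to discretize Equation 6 with a random catalog is to weight the
assumed power law by the random-random pair counts in each angular bin, since
those counts trace how much of the survey area is separated by each angle. The
sketch below assumes this RR-weighted form; the function name
`integral_constraint` and its arguments are hypothetical, and the RR counts are
assumed to span all separations present in the survey.

```python
import numpy as np

def integral_constraint(theta, rr, amplitude, delta):
    """RR-weighted mean of the model w(theta) = amplitude * theta**delta."""
    theta = np.asarray(theta, dtype=float)
    rr = np.asarray(rr, dtype=float)
    model = amplitude * theta**delta
    # Each bin contributes in proportion to the number of random pairs in it.
    return float(np.sum(rr * model) / np.sum(rr))
```

The returned value would be added to the measured angular correlation function
in every bin; because it depends on the fitted amplitude, the fit and the
correction are usually iterated.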

The projected angular two-point correlation function, $\omega(\theta)$, can generally be
fit with a power law,

$$\omega(\theta) = A_\omega\, \theta^{\delta}, \tag{7}$$

where $A_\omega$ is the clustering amplitude at a given scale (often 1') and $\delta$ is the slope
of the correlation function.
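
As a minimal sketch of such a fit, a power law is a straight line in log-log
space, so the amplitude and slope can be estimated with a linear least-squares
fit. The function name `fit_power_law` is hypothetical, and the fit shown here
is unweighted: it ignores measurement errors and the covariance between bins,
and it requires all measured values to be positive.

```python
import numpy as np

def fit_power_law(theta, w):
    """Fit w = A * theta**delta; returns (A, delta)."""
    theta = np.asarray(theta, dtype=float)
    w = np.asarray(w, dtype=float)
    # Straight-line fit: log10(w) = delta * log10(theta) + log10(A).
    slope, intercept = np.polyfit(np.log10(theta), np.log10(w), 1)
    return 10.0**intercept, slope
```

The amplitude returned is the value of the fit at unit angle in whatever units
the angles are supplied in.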

From measurements of $\omega(\theta)$ one can infer the three-dimensional spatial
two-point correlation function, $\xi(r)$, if the redshift distribution of the sources is
well known. The two-point correlation function, $\xi(r)$, is usually fit as a power
law, $\xi(r) = (r/r_0)^{-\gamma}$, where $r_0$ is the characteristic scale length of the galaxy
clustering, defined as the scale at which $\xi = 1$. As the two-dimensional galaxy
clustering seen in the plane of the sky is a projection of the three-dimensional
clustering, $\omega(\theta)$ is directly related to its three-dimensional analog $\xi(r)$. For
a given $\xi(r)$, one can predict the amplitude and slope of $\omega(\theta)$ using Limber's
equation, effectively integrating $\xi(r)$ along the redshift direction (e.g. Peebles
1980). If one assumes $\xi(r)$ (and thus $\omega(\theta)$) to be a power law over the redshift
range of interest, such that



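$$\xi(r, z) = \left[\frac{r}{r_0(z)}\right]^{-\gamma}, \tag{8}$$

then

$$\omega(\theta) = \sqrt{\pi}\, \frac{\Gamma[(\gamma - 1)/2]}{\Gamma(\gamma/2)}\, \frac{A}{\theta^{\gamma - 1}}, \tag{9}$$

where $\Gamma$ is the usual gamma function, so that the angular slope is
$\delta = 1 - \gamma$. The amplitude factor, $A$, is given by

$$A = \frac{\int_0^\infty r_0^{\gamma}(z)\, g(z) \left(\frac{dN}{dz}\right)^2 dz}{\left[\int_0^\infty \left(\frac{dN}{dz}\right) dz\right]^2}, \tag{10}$$

where $dN/dz$ is the number of galaxies per unit redshift interval and $g(z)$
depends on $\gamma$ and the cosmological model:

$$g(z) = \left(\frac{dz}{dr}\right) r^{(1 - \gamma)} F(r). \tag{11}$$

Here $F(r)$ is the curvature factor in the Robertson-Walker metric,

$$ds^2 = c^2 dt^2 - a^2 \left[\frac{dr^2}{F(r)^2} + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2\right]. \tag{12}$$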

If the redshift distribution of sources, $dN/dz$, is well known, then the amplitude
of $\omega(\theta)$ can be predicted for a given power law model of $\xi(r)$, such that
measurements of $\omega(\theta)$ can be used to place constraints on the evolution of $\xi(r)$.
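
To make the dependence on the redshift distribution concrete, the Limber
amplitude for a power-law correlation function can be evaluated numerically.
The sketch below is illustrative only and rests on assumptions not made in the
text: a flat universe with a cosmological constant (so the curvature factor is
unity and $dz/dr = H(z)/c$), a comoving scale length that does not evolve with
redshift, distances in $h^{-1}$ Mpc, and angles in radians. The function names
and the default matter density are hypothetical choices.

```python
import numpy as np
from math import gamma, pi, sqrt

C_OVER_H0 = 2997.92458  # c/H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc

def _trapz(y, x):
    """Trapezoidal integral of samples y on grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def limber_amplitude(z, dndz, r0, gam, omega_m=0.3):
    """Amplitude factor A for a redshift grid z (with z[0] > 0) and dN/dz."""
    z = np.asarray(z, dtype=float)
    dndz = np.asarray(dndz, dtype=float)
    ez = np.sqrt(omega_m * (1 + z)**3 + 1 - omega_m)   # H(z)/H0, flat model
    inv_e = 1.0 / ez
    # Comoving distance r(z) = (c/H0) * integral of dz'/E(z') from 0 to z;
    # the first segment runs from z = 0 (where E = 1) to z[0].
    first = 0.5 * (1.0 + inv_e[0]) * z[0]
    steps = 0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(z)
    r = C_OVER_H0 * (first + np.concatenate(([0.0], np.cumsum(steps))))
    g = (ez / C_OVER_H0) * r**(1.0 - gam)              # (dz/dr) r^(1-gamma), F = 1
    return _trapz(r0**gam * g * dndz**2, z) / _trapz(dndz, z)**2

def w_theta(theta_rad, amp, gam):
    """Angular correlation function for the power-law case."""
    prefactor = sqrt(pi) * gamma((gam - 1) / 2) / gamma(gam / 2)
    return prefactor * amp * theta_rad**(1 - gam)
```

Evaluating `limber_amplitude` for a narrow and a broad redshift distribution
with the same scale length shows the projection effect directly: the broader
distribution yields the smaller angular amplitude.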

Interpreting angular clustering results can be difficult, however, as there is 
a degeneracy between the inherent clustering amplitude and the redshift
distribution of the sources in the sample. For example, an observed weakly clustered
signal projected on the plane of the sky could be due either to a galaxy
population that is intrinsically weakly clustered and projected over a relatively
short distance along the line of sight, or to an inherently strongly
clustered distribution integrated over a long distance, which would wash out the
signal. The uncertainty on the redshift distribution is therefore often the
dominant error in analyses using angular clustering measurements. The assumed
galaxy redshift distribution ($dN/dz$) has varied widely in different studies, such
that similar observed angular clustering results have led to widely different
conclusions. A further complication is that each sample usually spans a large range
of redshifts and is magnitude-limited, such that the mean intrinsic luminosity
of the galaxies is changing with redshift within a sample, which can hinder
interpretation of the evolution of clustering measured in $\omega(\theta)$ studies.
Many of the first measurements of large scale structure were studies of angular
clustering. One of the earliest determinations was the pioneering work of
Peebles (1975) using photographic plates from the Lick survey (Fig. 1), which
found $\omega(\theta)$ to be well fit by a power law with a slope of $\delta = -0.8$. Later studies
using CCDs were able to reach deeper magnitude limits and found that fainter
galaxies had a lower clustering amplitude. One such study was conducted by
Postman et al. (1998), who surveyed a contiguous 4° by 4° field to a depth of
$I_{AB} = 24$ mag, reaching to z ~ 1. Later surveys that covered multiple fields on
the sky found similar results. The lower clustering amplitude observed for galaxies
with fainter apparent magnitudes can in principle be due to clustering
being a function of luminosity, of redshift, or of both. To disentangle this
dependence, each author assumes a $dN/dz$ distribution for galaxies as a function
of apparent magnitude and then fits the observed $\omega(\theta)$ with different models
of the redshift dependence of clustering. While many authors measure similar
values of the dependence of $\omega(\theta)$ on apparent magnitude, due to differences in
the assumed $dN/dz$ distributions different conclusions are reached regarding the
amount of luminosity and redshift dependence of galaxy clustering. Additionally,
quoted error bars on the inferred values of $r_0$ generally include only Poisson
and/or cosmic variance error estimates, while the dominant error is often the
lack of knowledge of $dN/dz$ for the particular sample in question.

Because of the sensitivity of the inferred value of $r_0$ to the redshift
distribution of sources, it is preferable to measure the three dimensional correlation
function. While it is much easier to interpret three dimensional clustering
measurements, in cases where it is still not feasible to obtain redshifts for a large
fraction of galaxies in the sample, angular clustering measurements are still
employed. This is currently the case in particular with high redshift and/or very
dusty, optically-obscured galaxy samples, such as sub-millimeter galaxies (e.g.,
Brodwin et al., 2008; Maddox et al., 2010). However, without knowledge of the
redshift distribution of the sources, these measurements are hard to interpret.



4 Real and Redshift Space Clustering 

Measurements of the two-point correlation function use the redshift of a galaxy, 
not its distance, to infer its location along the line of sight. This introduces 
two complications: one is that a cosmological model has to be assumed to con- 
vert measured redshifts to inferred distances, and the other is that peculiar 
velocities introduce redshift space distortions in $\xi$ parallel to the line of sight
(Sargent & Turner, 1977). On the first point, errors on the assumed cosmology
are generally subdominant, so that while in theory one could assume different 
cosmological parameters and check which results are consistent with the as- 
sumed values, that is generally not necessary. On the second point, redshift 
space distortions can be measured to constrain cosmological parameters, and 
they can also be integrated over to recover the underlying real space correlation 
function. 

On small spatial scales (< 1 $h^{-1}$ Mpc), within collapsed virialized
overdensities such as groups and clusters, galaxies have large random motions relative
to each other. Therefore while all of the galaxies in the group or cluster have a
similar physical distance from the observer, they have somewhat different
redshifts. This causes an elongation in redshift space maps along the line of sight
within overdense regions, which is referred to as "Fingers of God". The result is
that groups and clusters appear to be radially extended along the line of sight
towards the observer. This effect can be seen clearly in Fig. 4, where the lower
left panel shows galaxies in redshift space with large "Fingers of God" pointing
back to the observer, while in the lower right panel the "Fingers of God" have
been modeled and removed. Redshift space distortions are also seen on larger
scales (> 1 $h^{-1}$ Mpc) due to streaming motions of galaxies that are infalling
onto structures that are still collapsing. Adjacent galaxies will all be moving
in the same direction, which leads to coherent motion and causes an apparent
contraction of structure along the line of sight in redshift space (Kaiser, 1987),
in the opposite sense as the "Fingers of God".

Redshift space distortions can be clearly seen in measurements of galaxy 
clustering. While redshift space distortions can be used to uncover informa- 
tion about the underlying matter density and thermal motions of the galaxies 
(discussed below) , they complicate a measurement of the two-point correlation 
function in real space. Instead of $\xi(r)$, what is measured is $\xi(s)$, where $s$ is
the redshift space separation between a pair of galaxies. While some results in
the literature present measurements of $\xi(s)$ for various galaxy populations, it is
not straightforward to compare results for different galaxy samples and different
redshifts, as the amplitude of redshift space distortions differs depending on the
galaxy type and redshift. Additionally, $\xi(s)$ does not follow a power law over
the same scales as $\xi(r)$, as redshift space distortions on both small and large
scales decrease the amplitude of clustering relative to intermediate scales.

The real-space correlation function, ^(r), measures the underlying physical 
clustering of galaxies, independent of any peculiar velocities. Therefore, in order 
to recover the real-space correlation function, one can measure $\xi$ in two
dimensions, both perpendicular to and along the line of sight. Following Fisher et al.
(1994), $\mathbf{v}_1$ and $\mathbf{v}_2$ are defined to be the redshift positions of a pair of galaxies,
$\mathbf{s}$ to be the redshift space separation ($\mathbf{v}_1 - \mathbf{v}_2$), and
$\mathbf{l} = \frac{1}{2}(\mathbf{v}_1 + \mathbf{v}_2)$ to be the
mean distance to the pair. The separations between the two galaxies across ($r_p$)
and along ($\pi$) the line of sight are defined as

$$\pi = \frac{|\mathbf{s} \cdot \mathbf{l}|}{|\mathbf{l}|}, \tag{13}$$

$$r_p = \sqrt{\mathbf{s} \cdot \mathbf{s} - \pi^2}. \tag{14}$$
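
The decomposition of a pair separation into components along and across the
line of sight can be sketched as follows. The function name `rp_pi` is
hypothetical, and the two inputs are assumed to be three-dimensional
redshift-space position vectors with the observer at the origin.

```python
import numpy as np

def rp_pi(v1, v2):
    """Return (r_p, pi) for a pair of redshift-space position vectors."""
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    s = v1 - v2                 # redshift-space separation vector
    l = 0.5 * (v1 + v2)         # mean position of the pair
    pi_sep = abs(np.dot(s, l)) / np.linalg.norm(l)
    # Guard against tiny negative values from floating-point round-off.
    rp = np.sqrt(max(np.dot(s, s) - pi_sep**2, 0.0))
    return rp, pi_sep
```

By construction the two components add in quadrature to the total
redshift-space separation of the pair.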

One can then compute pair counts over a two-dimensional grid of separations to
estimate $\xi(r_p, \pi)$. $\xi(s)$, the one-dimensional redshift space correlation function,
is then equivalent to the azimuthal average of $\xi(r_p, \pi)$.

An example of a measurement of $\xi(r_p, \pi)$ is shown in Fig. 5. Plotted is $\xi$ as a
function of separation $r_p$ (defined in this figure to be $\sigma$) across and $\pi$ along the
line of sight. What is usually shown is the upper right quadrant of this figure,
which here has been reflected about both axes to emphasize the distortions.
Contours of constant $\xi$ follow the color-coding, where yellow corresponds to
large $\xi$ values and green to low values. On small scales across the line of sight
($r_p$ or $\sigma$ <~ 2 $h^{-1}$ Mpc) the contours are clearly elongated in the $\pi$ direction;
this reflects the "Fingers of God" from galaxies in virialized overdensities. On
large scales across the line of sight ($r_p$ or $\sigma$ >~ 10 $h^{-1}$ Mpc) the contours are
flattened along the line of sight, due to "the Kaiser effect". This indicates that
galaxies on these linear scales are coherently streaming onto structures that are
still collapsing.

Figure 4: An illustration of the "Fingers of God" (FoG), or elongation of
virialized structures along the line of sight, from Tegmark et al. (2004). Shown are
galaxies from a slice of the SDSS sample (projected here through the declination
direction) in two dimensional comoving space. The top row shows all galaxies
in this slice (67,626 galaxies in total), while the bottom row shows galaxies that
have been identified as having "Fingers of God". The right column shows the
position of these galaxies in this space after modeling and removing the effects
of the "Fingers of God". The observer is located at (x, y) = (0, 0), and the "Fingers
of God" effect can be seen in the lower left panel as the positions of galaxies
being radially smeared along the line of sight toward the observer.

Figure 5: The two-dimensional redshift space correlation function from 2dFGRS
(Peacock et al., 2001). Shown is $\xi(r_p, \pi)$ (in the figure $\sigma$ is used instead of $r_p$),
the correlation function as a function of separation across ($\sigma$ or $r_p$) and along
($\pi$) the line of sight. Contours show lines of constant $\xi$ at $\xi$ = 10, 5, 2, 1, 0.5,
0.2, 0.1. Data from the first quadrant (upper right) are reflected about both the
$\sigma$ and $\pi$ axes, to emphasize deviations from circular symmetry due to redshift
space distortions.

As this effect is due to the gravitational infall of galaxies onto massive
forming structures, the strength of the signature depends on $\Omega_{\rm matter}$. Kaiser
(1987) derived that the large-scale anisotropy in the $\xi(r_p, \pi)$ plane depends on
$\beta \equiv \Omega_{\rm matter}^{0.6}/b$ on linear scales, where $b$ is the bias or the ratio of density
fluctuations in the galaxy population relative to that of dark matter (discussed further
in the next section below). Anisotropies are quantified using the multipole
moments of $\xi(r_p, \pi)$, defined as

$$\xi_l(s) = \frac{2l + 1}{2} \int \xi(r_p, \pi)\, P_l(\cos\theta)\, d\cos\theta, \tag{15}$$

where $s$ is the distance as measured in redshift space, $P_l$ are Legendre
polynomials, and $\theta$ is the angle between $s$ and the line of sight. The ratio $\xi_2/\xi_0$,
the quadrupole to monopole moments of the two-point correlation function, is
related to $\beta$ in a simple manner using linear theory (Hamilton, 1998):

$$\frac{\xi_2}{\xi_0} = f(n)\, \frac{\frac{4}{3}\beta + \frac{4}{7}\beta^2}{1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2}, \tag{16}$$

where $f(n) = (3 + n)/n$ and $n$ is the index of the two-point correlation function
in a power-law form: $\xi \propto r^{-(3+n)}$ (Hamilton, 1992).



Peacock et al. (2001) find using measurements of the quadrupole-to-monopole
ratio in the 2dFGRS data (see Fig. 5) that $\beta = 0.43 \pm 0.07$. For a bias value
of around unity (see Section 5 below), this implies a low value of $\Omega_{\rm matter}$ ~ 0.3.
Similar measurements have been made with clustering measurements using data
from the SDSS. Very large galaxy samples are needed to detect this coherent
infall and obtain robust estimates of $\beta$. At higher redshift, Guzzo et al. (2008)
find $\beta = 0.70 \pm 0.26$ at z = 0.77 using data from the VVDS and argue that
measurements of $\beta$ as a function of redshift can be used to trace the expansion
history of the Universe. We return to the discussion of redshift space distortions
on small scales below in Section 6.3.
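
The linear-theory relations above are simple enough to evaluate directly. The
sketch below assumes the definition of $\beta$ as the matter density to the
power 0.6 divided by the bias, together with the quadrupole-to-monopole ratio
of Equation 16; the function names are hypothetical.

```python
def quadrupole_to_monopole(beta, n):
    """Linear-theory ratio xi_2/xi_0 for xi proportional to r**-(3 + n)."""
    f_n = (3 + n) / n
    return f_n * (4 / 3 * beta + 4 / 7 * beta**2) / (1 + 2 / 3 * beta + 1 / 5 * beta**2)

def beta_from_omega(omega_matter, bias):
    """beta = Omega_matter**0.6 / b."""
    return omega_matter**0.6 / bias

def omega_matter_from_beta(beta, bias):
    """Invert beta = Omega_matter**0.6 / b for the matter density."""
    return (beta * bias) ** (1 / 0.6)
```

For a correlation function slope of 1.8 the index is n = -1.2, so the ratio is
negative, reflecting the flattening of the contours along the line of sight.
With a bias of exactly unity, a value of 0.43 for $\beta$ inverts to a matter
density of about 0.25; the result scales steeply with the assumed bias.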

What is often desired, however, is a measurement of the real space clustering 
of galaxies. To recover $\xi(r)$ one can then project $\xi(r_p, \pi)$ along the $r_p$ axis. As
redshift space distortions affect only the line-of-sight component of $\xi(r_p, \pi)$,
integrating over the $\pi$ direction leads to a statistic $w_p(r_p)$, which is independent
of redshift space distortions. Following Davis & Peebles (1983),

$$w_p(r_p) = 2 \int_0^\infty d\pi\, \xi(r_p, \pi) = 2 \int_0^\infty dy\, \xi\!\left[(r_p^2 + y^2)^{1/2}\right], \tag{17}$$

where $y$ is the real-space separation along the line of sight. If $\xi(r)$ is modeled as
a power-law, $\xi(r) = (r/r_0)^{-\gamma}$, then $r_0$ and $\gamma$ can be readily extracted from the
projected correlation function, $w_p(r_p)$, using an analytic solution to Equation 17:

$$w_p(r_p) = r_p \left(\frac{r_0}{r_p}\right)^{\gamma} \frac{\Gamma\!\left(\frac{1}{2}\right) \Gamma\!\left(\frac{\gamma - 1}{2}\right)}{\Gamma\!\left(\frac{\gamma}{2}\right)}, \tag{18}$$

where $\Gamma$ is the usual gamma function. A power-law fit to $w_p(r_p)$ will then recover
$r_0$ and $\gamma$ for the real-space correlation function, $\xi(r)$. In practice, Equation 17
is not integrated to infinite separations. Often values of $\pi_{\rm max}$ are ~40-80 $h^{-1}$
Mpc, which includes most correlated pairs. It is worth noting that the values of
$r_0$ and $\gamma$ inferred are covariant. One must therefore be careful when comparing
clustering amplitudes of different galaxy populations; simply comparing the $r_0$
values may be misleading if the correlation function slopes are different. It is
often preferred to compare the galaxy bias instead (see next section).
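
The effect of truncating the line-of-sight integral can be checked numerically
for the power-law case. The sketch below compares the analytic projected
correlation function with a trapezoidal integration out to a finite maximum
separation; the function names and the number of integration steps are
hypothetical choices, and all lengths are assumed to share the same units.

```python
import numpy as np
from math import gamma

def wp_power_law(rp, r0, gam):
    """Analytic projection of xi(r) = (r/r0)**-gam integrated to infinity."""
    return rp * (r0 / rp)**gam * gamma(0.5) * gamma((gam - 1) / 2) / gamma(gam / 2)

def wp_truncated(rp, r0, gam, pi_max, n_steps=20001):
    """Numerical projection of the same power law, integrated only to pi_max."""
    y = np.linspace(0.0, pi_max, n_steps)
    xi = (np.sqrt(rp**2 + y**2) / r0)**(-gam)
    # Trapezoidal rule, doubled to cover both signs of the line-of-sight offset.
    return 2.0 * float(np.sum(0.5 * (xi[1:] + xi[:-1]) * np.diff(y)))
```

For a scale length of 5 and a slope of 1.8, integrating to a maximum separation
of 80 at a projected separation of 1 recovers all but a few percent of the
analytic value, consistent with the statement that such limits include most
correlated pairs.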

As a final note on measuring the two-point correlation function, as can be 
seen from Fig. 3, flux-limited galaxy samples contain a higher density of galaxies 
at lower redshift. This is purely an observational artifact, due to the apparent 
magnitude limit including intrinsically lower luminosity galaxies nearby, while 
only tracing the higher luminosity galaxies further away. As discussed below 
in Section 6, because the clustering amplitude of galaxies depends on their 
properties, including luminosity, one would ideally only measure $\xi(r)$ in
volume-limited samples, where galaxies of the same absolute magnitude are observed
throughout the entire volume of the sample, including at the highest redshifts.
Therefore often the full observed galaxy population is not used in measurements
of $\xi(r)$; rather, volume-limited sub-samples are created where all galaxies are
brighter than a given absolute magnitude limit. This greatly facilitates the 
theoretical interpretation of clustering measurements (see Section 8) and the 
comparison of results from different surveys. 
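
Selecting a volume-limited sub-sample amounts to a pair of cuts, sketched below.
The function name `volume_limited_mask` is hypothetical, and the absolute
magnitudes are assumed to have been computed already. The redshift limit must
be chosen so that a galaxy at the absolute magnitude limit would still pass the
survey's apparent magnitude limit at that redshift; that step depends on the
cosmology and K-corrections and is not shown.

```python
import numpy as np

def volume_limited_mask(z, abs_mag, mag_limit, z_max):
    """Boolean mask of galaxies brighter than mag_limit and within z_max."""
    z = np.asarray(z, dtype=float)
    abs_mag = np.asarray(abs_mag, dtype=float)
    # Brighter galaxies have more negative absolute magnitudes.
    return (abs_mag <= mag_limit) & (z <= z_max)
```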



5 Galaxy Bias 

It was realized decades ago that the spatial clustering of observable galaxies 
need not precisely mirror the clustering of the bulk of the matter in the Uni- 
verse. In its most general form, the galaxy density can be a non-local and 
stochastic function of the underlying dark matter density. This galaxy "bias" - 
the relationship between the spatial distribution of galaxies and the underlying 
dark matter density field - is a result of the varied physics of galaxy formation 
which can cause the spatial distribution of baryons to differ from that of dark 
matter. Stochasticity appears to have little effect on bias except for adding
extra variance (e.g., Scoccimarro, 2000), and non-locality can be taken into
account to first order by using smoothed densities over larger scales. In this
approximation, the smoothed galaxy density contrast is a general function of 
the underlying dark matter density contrast on some scale: 

- f{S), (19) 






where $\delta \equiv (\rho/\bar{\rho}) - 1$ and $\bar{\rho}$ is the mean mass density on that scale. If we assume
$f(\delta)$ is a linear function of $\delta$, then we can define the linear galaxy bias $b$ as the
ratio of the mean overdensity of galaxies to the mean overdensity of mass,

$$b = \delta_g/\delta, \tag{20}$$

which can in theory depend on scale and galaxy properties such as luminosity,
morphology, color and redshift. In terms of the correlation function, the linear 
bias is defined as the square root of the ratio of the two-point correlation function 
of the galaxies relative to the dark matter: 

$$b = (\xi_{\rm gal}/\xi_{\rm dark\ matter})^{1/2}, \tag{21}$$

and is a function of scale. Note that $\xi_{\rm dark\ matter}$ is the Fourier transform of the
dark matter power spectrum. The bias of galaxies relative to dark matter is 
often referred to as the absolute bias, as opposed to the relative bias between 
galaxy populations (discussed below). 
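
As a concrete illustration of Eq. 21, the following sketch computes a scale-dependent bias from tabulated correlation functions. The function name and the toy power-law inputs are assumptions made for this example; in practice the dark matter correlation function would come from a simulation or a theoretical model.

```python
import numpy as np

def linear_bias(xi_gal, xi_dm):
    """Scale-dependent bias b(r) = sqrt(xi_gal / xi_dm), as in Eq. 21.

    Both inputs must be tabulated at the same separations and be positive.
    """
    xi_gal = np.asarray(xi_gal, dtype=float)
    xi_dm = np.asarray(xi_dm, dtype=float)
    return np.sqrt(xi_gal / xi_dm)

# Toy inputs: a power-law dark matter xi(r) and a galaxy sample whose
# correlation function is 1.44 times larger on all scales.
r = np.array([5.0, 10.0, 20.0])      # separations, h^-1 Mpc
xi_dm = (r / 5.0) ** (-1.8)
xi_gal = 1.44 * xi_dm
b = linear_bias(xi_gal, xi_dm)       # bias of 1.2 at every separation
```

Because the bias enters the correlation function squared, a 44% higher clustering amplitude corresponds to only a 20% higher bias.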

The concept of galaxies being a biased tracer of the underlying total mass
field (which is dominated by dark matter) was introduced by Kaiser (1984) in
an attempt to reconcile the different clustering scale lengths of galaxies and rich
clusters, which could not both be unbiased tracers of mass. Kaiser (1984) shows
that clusters of galaxies would naturally have a large bias as a result of being
rare objects which formed at the highest density peaks of the mass distribution,
above some critical threshold. This idea is further developed analytically by
Bardeen et al. (1986) for galaxies, who show that for a Gaussian distribution of
initial mass density fluctuations, the peaks which first collapse to form galaxies
will be more clustered than the underlying mass distribution. Mo & White
(1996) use extended Press-Schechter theory to determine that the bias depends
on the mass of the dark matter halo as well as the epoch of galaxy formation
and that a linear bias is a decent approximation well into the non-linear regime
where $\delta > 1$. The evolution of bias with redshift is developed in theoretical
work by Fry (1996) and Tegmark & Peebles (1998), who find that the bias is
naturally larger at earlier epochs of galaxy formation, as the first galaxies to
form will collapse in the most overdense regions of space, which are biased (akin
to mountain peaks being clustered). They further show that regardless of the
initial amplitude of the bias factor, with time galaxies will become unbiased
tracers of the mass distribution ($b \to 1$ as $t \to \infty$). Additionally, Mann et al.
(1998) find that while bias is generally scale-dependent, the dependence is weak
and on large scales the bias tends towards a constant value.

A galaxy population can be "anti-biased" if $b < 1$, indicating that galaxies
are less clustered than the dark matter distribution. As discussed below, this 
appears to be the case for some galaxy samples at low redshift. The galaxy bias 
of a given observational sample is often inferred by comparing the observed clus- 
tering of galaxies with the clustering of dark matter measured in a cosmological 
simulation. Therefore the bias depends on the cosmological model used in the 
simulation. The dominant relevant cosmological parameter is $\sigma_8$, defined as the
standard deviation of galaxy count fluctuations in a sphere of radius 8 $h^{-1}$ Mpc,
and the absolute bias value inferred can be simply scaled with the assumed value
of $\sigma_8$. As discussed in Section 9.1 below, the absolute galaxy bias can also be
estimated from the data directly, without having to resort to comparisons with
cosmological simulations, by using the ratio of the two-point and three-point
correlation functions, which have different dependencies on the bias. While this
measurement can be somewhat noisy, it has the advantage of not assuming a
cosmological model from which to derive the dark matter clustering. This
measurement is performed by Verde et al. (2002) and Gaztañaga et al. (2005), who
find that galaxies in the 2dFGRS have a linear bias value very close to unity on
large scales.
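
The rescaling of an inferred absolute bias to a different assumed $\sigma_8$ can be sketched as follows. The sketch assumes that on linear scales the dark matter correlation function scales as $\sigma_8^2$, so that by Eq. 21 the inferred bias scales as $1/\sigma_8$; the function name and the numbers are purely illustrative.

```python
def rescale_bias(b_old, sigma8_old, sigma8_new):
    """Rescale an absolute bias inferred with sigma8_old to sigma8_new.

    Assumes xi_dark_matter is proportional to sigma8**2 (linear scales),
    so that b = sqrt(xi_gal / xi_dark_matter) is proportional to 1/sigma8.
    """
    return b_old * sigma8_old / sigma8_new

# A bias of 1.2 inferred with sigma8 = 0.9 corresponds to a larger bias
# if the true dark matter clustering is weaker (sigma8 = 0.8).
b_new = rescale_bias(1.2, sigma8_old=0.9, sigma8_new=0.8)
```

The observed galaxy clustering is unchanged by this operation; only the dark matter reference amplitude, and hence the ratio, is altered.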

The relative bias between different galaxy populations can also be measured 
and is defined as the ratio of the clustering of one population relative to another. 
This is often measured using the ratio of the projected correlation functions of 
each population: 

$$b_{\rm gal1/gal2} = (w_{p,{\rm gal1}}/w_{p,{\rm gal2}})^{1/2}, \tag{22}$$

where both measurements of $w_p(r_p)$ have been integrated to the same value of
$\pi_{\rm max}$. The relative bias is used to compare the clustering of galaxies as a function
of observed parameters and does not refer to the clustering of dark matter. It is
a useful way to compare the observed clustering for different galaxy populations
without having to rely on an assumed value of $\sigma_8$ for the dark matter.



6 The Dependence of Clustering on Galaxy Properties

The two-point correlation function has long been known to depend on galaxy 
properties and can vary as a function of galaxy luminosity, morphological or 
spectral type, color, stellar mass, and redshift. The general trend is that galax- 
ies that are more luminous, early-type, bulge-dominated, optically red, and/or 
higher stellar mass are more clustered than galaxies that are less luminous, 
late-type, disk-dominated, optically blue, and/or lower stellar mass. Presented 
below are relatively recent results indicating how clustering properties depend 
on galaxy properties from the largest redshift surveys currently available. The
physical interpretation behind these trends is presented in Section 8 below. 



6.1 Luminosity Dependence 

Fig. 3 shows the large scale structure reflected in the galaxy distribution at low 
redshift. What is plotted is the spatial distribution of galaxies in a flux-limited 
sample, meaning that all galaxies down to a given apparent magnitude limit are 
included. This results in the apparent lack of galaxies or structure at higher 
redshift in the figure, as at large distances only the most luminous galaxies 
will be included in a flux-limited sample. In order to robustly determine the
underlying clustering, one should, if possible, create volume-limited subsamples
in which galaxies of the same luminosity can be detected at all redshifts. In
this way the mean luminosity of the sample does not change with redshift and
galaxies at all redshifts are weighted equally.

The left panel of Fig. 6 shows the projected correlation function, $w_p(r_p)$,
for galaxies in SDSS in volume-limited subsamples corresponding to different
absolute magnitude ranges. The more luminous galaxies are more strongly
clustered across a wide range in absolute optical magnitude, $-23 < M_r < -17$.
Power law fits on scales from $\sim$0.1 $h^{-1}$ Mpc to $\sim$10 $h^{-1}$ Mpc show that
while the clustering amplitude depends sensitively on luminosity, the slope does
not. Only in the brightest and faintest magnitude bins does the slope deviate
from $\gamma \sim 1.8$ and have a steeper value of $\gamma \sim 2.0$. Across this magnitude range
$r_0$ varies from $\sim$2.8 $h^{-1}$ Mpc at the faint end to $\sim$10 $h^{-1}$ Mpc at the bright
end. This same general trend is found in the 2dFGRS and other redshift surveys
(e.g., Norberg et al. 2001).
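
To make the meaning of the power law parameters concrete, the sketch below recovers $r_0$ and $\gamma$ from a tabulated $\xi(r) = (r/r_0)^{-\gamma}$ by a straight-line fit in log-log space. It is a simplified stand-in for the fits described above: it fits $\xi(r)$ directly rather than $w_p(r_p)$, and it ignores measurement errors and the covariance between bins. The function name and the synthetic data are assumptions made for this example.

```python
import numpy as np

def fit_power_law(r, xi):
    """Fit xi(r) = (r / r0)**(-gamma) by least squares in log-log space.

    ln(xi) = -gamma * ln(r) + gamma * ln(r0), so the slope of the fitted
    line gives -gamma and the intercept gives gamma * ln(r0).
    """
    slope, intercept = np.polyfit(np.log(r), np.log(xi), 1)
    gamma = -slope
    r0 = np.exp(intercept / gamma)
    return r0, gamma

# Synthetic, noise-free data with r0 = 5 h^-1 Mpc and gamma = 1.8.
r = np.logspace(-1.0, 1.0, 10)       # 0.1 to 10 h^-1 Mpc
xi = (r / 5.0) ** (-1.8)
r0_fit, gamma_fit = fit_power_law(r, xi)
```

The correlation length $r_0$ is simply the separation at which the fitted $\xi(r)$ equals unity, which is why amplitudes can only be compared through $r_0$ when the slopes are similar.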



The right panel of Fig. 6 shows the relative bias of SDSS galaxies as a
function of luminosity, relative to the clustering of $L^*$ galaxies, measured at
the scale of $r_p = 2.7\ h^{-1}$ Mpc, which is in the non-linear regime where $\delta > 1$
(Zehavi et al. 2005). $L^*$ is the characteristic galaxy luminosity, defined as the
luminosity of the break in the galaxy luminosity function. The relative bias is
seen to steadily increase at higher luminosity and rise sharply above $L^*$. This is
in good agreement with the results from Tegmark et al. (2004), using the power
spectrum of SDSS galaxies measured in the linear regime on a scale of $\sim$100
$h^{-1}$ Mpc. The data also agree with the clustering results of galaxies in the
2dFGRS from Norberg et al. (2001). The overall shape of the relative bias with
luminosity indicates a slow rise up to the value at $L^*$, above which the rise is
much steeper. As discussed in Section 8.2 below, this trend shows that brighter
galaxies reside in more massive dark matter halos than fainter galaxies.

6.2 Color and Spectral Type Dependence 

The clustering strength of galaxies also depends on restframe color and spectral 
type, with a stronger dependence than on luminosity. Fig. 7 shows the spatial 
distribution of galaxies in SDSS, color coded as a function of restframe color. 
Red galaxies are seen to preferentially populate the most overdense regions, 
while blue galaxies are more smoothly distributed in space. This is reflected 
in the correlation function of galaxies split by restframe color. Red galaxies 
have a larger correlation length and steeper slope than blue galaxies: $r_0 \sim$ 5-6
$h^{-1}$ Mpc and $\gamma \sim 2.0$ for red $L^*$ galaxies, while $r_0 \sim$ 3-4 $h^{-1}$ Mpc and $\gamma \sim 1.7$
for blue $L^*$ galaxies in SDSS (Zehavi et al. 2005). Clustering studies from the
2dFGRS split the galaxy sample at low redshift by spectral type into galaxies
with emission line spectra versus absorption line spectra, corresponding to star
forming and quiescent galaxies, and find similar results: quiescent galaxies
have larger correlation lengths and steeper clustering slopes than star forming
galaxies (Madgwick et al. 2003).

Red and blue galaxies have distinct luminosity-dependent clustering properties.
As shown in Fig. 8, the general trends seen in $r_0$ and $\gamma$ with luminosity for
all galaxies are well-reflected in the blue galaxy population; however, at faint
luminosities ($L < 0.5 L^*$) red galaxies have larger clustering amplitudes and
slopes than $L^*$ red galaxies. This reflects the fact that faint red galaxies are
often found distributed throughout galaxy clusters.

Figure 6: Luminosity-dependence of galaxy clustering. On the left is shown the
projected correlation function, $w_p(r_p)$, for SDSS galaxies in different absolute
magnitude ranges, where brighter galaxies are seen to be more clustered. On
the right is the relative bias of galaxies as a function of luminosity. Both figures
are from Zehavi et al. (2005).

Galaxy clustering also depends on other galaxy properties such as stellar
mass, concentration index, and the strength of the 4000 Å break ($D_{4000}$), in
that galaxies that have larger stellar mass, more centrally concentrated light
profiles, and/or larger $D_{4000}$ measurements (indicating older stellar populations)
are more clustered (Li et al. 2006). This is not surprising given the observed
trends with luminosity and color and the known dependencies of other galaxy
properties with luminosity and color. Clearly the galaxy bias is a complicated 
function of various galaxy properties. 

6.3 Redshift Space Distortions 

The fact that red galaxies are more clustered than blue galaxies is related to
the morphology-density relation (Dressler 1980), which results from the fact
that galaxies with elliptical morphologies are more likely to be found in regions
of space with a higher local surface density of galaxies. The redshift space
distortions seen for red and blue galaxies also show this.

As discussed in Section 4, redshift space distortions arise from two different
phenomena: virialized motions of galaxies within collapsed overdensities such as
groups and clusters, and the coherent streaming motion of galaxies onto larger
structures that are still collapsing. The former is seen on relatively small scales
($r_p \lesssim 1\ h^{-1}$ Mpc) while the latter is detected on larger scales ($r_p > 1\ h^{-1}$ Mpc).
While the presence of redshift space distortions complicates the measurement of
the real space $\xi(r)$, these distortions can be used to uncover information about
the thermal motions of galaxies in groups and clusters as well as the amplitude
of the mass density of the Universe, $\Omega_{\rm matter}$.

Figure 7: The spatial distribution of galaxies in the SDSS main galaxy sample
as a function of redshift and right ascension, projected through 8° in declination,
color coded by restframe optical color. Red galaxies are seen to be more
clustered than blue galaxies and generally trace the centers of groups and clusters,
while blue galaxies populate further into the galaxy voids. Taken from
Zehavi et al. (2011).

Figure 8: The clustering scale length, $r_0$ (left), and slope, $\gamma$ (right), for all,
red, and blue galaxies in SDSS as a function of luminosity. While all galaxies
are more clustered at brighter luminosities, and red galaxies are more clustered
than blue galaxies at all luminosities, below $L^*$ the red galaxy clustering length
increases at fainter luminosities. The clustering slope for faint red galaxies is
also much steeper than at other luminosities. Taken from Zehavi et al. (2011).

Figure 9: Two-dimensional redshift space correlation function $\xi(r_p, \pi)$ (as in
Fig. 6, here $\sigma$ is used instead of $r_p$) for quiescent, absorption line galaxies on the
left and star forming, emission line galaxies on the right. The redshift space
distortions are different for the different galaxy populations, with quiescent and/or
red galaxies showing more pronounced "Fingers of God". Both galaxy types
exhibit coherent infall on large scales. Contours show lines of constant $\xi$ at
$\xi$ = 10, 5, 2, 1, 0.5, 0.2, 0.1. Taken from Madgwick et al. (2003).

Fig. 9 shows $\xi(r_p, \pi)$ for quiescent and star forming galaxies in 2dF. The
quiescent galaxies on the left show larger "Fingers of God" than the star forming
galaxies on the right, reflecting the fact that red, quiescent galaxies have larger
motions relative to each other. This naturally arises if red, quiescent galaxies
reside in more massive, virialized overdensities with larger random peculiar
velocities than star forming, optically blue galaxies. The large scale coherent
infall of galaxies is seen both for blue and red galaxies, though it is often easier
to see for blue galaxies, due to their smaller "Fingers of God".

These small scale redshift space distortions can be quantified using the $\sigma_{12}$
statistic, known as the pairwise velocity dispersion (Davis et al.; Fisher et al.
1994). This is measured by modeling $\xi(r_p, \pi)$ in real space, denoted $\xi'$ below,
which is then convolved with a distribution of random pairwise motions, $f(v)$, such that

$$\xi(r_p, \pi) = \int_{-\infty}^{\infty} \xi'(r_p, \pi - v/H_0)\, f(v)\, dv, \tag{23}$$

where the random motions are often taken to have an exponential form, which
has been found to fit observed data well:

$$f(v) = \frac{1}{\sigma_{12}\sqrt{2}} \exp\left(-\frac{\sqrt{2}\,|v|}{\sigma_{12}}\right). \tag{24}$$



In the 2dFGRS, Madgwick et al. (2003) find that $\sigma_{12} = 416 \pm 76$ km s$^{-1}$ for
star forming galaxies and $\sigma_{12} = 612 \pm 92$ km s$^{-1}$ for quiescent galaxies, measured
on scales of 8-20 $h^{-1}$ Mpc. Using SDSS data, Zehavi et al. (2002) find that $\sigma_{12}$ is
$\sim$300-450 km s$^{-1}$ for blue, star forming galaxies and $\sim$650-750 km s$^{-1}$ for
red, quiescent galaxies. It has been shown, however, that this statistic can be
sensitive to large, rare overdensities, such that samples covering large volumes
are needed to measure it robustly.
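
The convolution used to model the small scale distortions can be sketched numerically. The example below assumes the standard exponential form for the pairwise velocity distribution, with $\sigma_{12}$ equal to its rms, and an isotropic power law as a toy real-space model; the function names, the grid settings and the choice $H_0 = 100$ km s$^{-1}$ per $h^{-1}$ Mpc are assumptions made for illustration, and coherent infall is ignored entirely.

```python
import numpy as np

def pairwise_velocity_pdf(v, sigma12):
    """Exponential pairwise velocity distribution with rms sigma12 (km/s)."""
    return np.exp(-np.sqrt(2.0) * np.abs(v) / sigma12) / (sigma12 * np.sqrt(2.0))

def trapezoid(y, x):
    """Trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def xi_power_law(r_p, pi, r0=5.0, gamma=1.8):
    """Toy isotropic real-space model (r / r0)**(-gamma), r in h^-1 Mpc."""
    r = np.sqrt(r_p ** 2 + pi ** 2)
    return (r / r0) ** (-gamma)

def xi_redshift_space(r_p, pi, sigma12, H0=100.0, v_max=5000.0, n_v=2001):
    """Convolve the real-space model with f(v) along the line of sight."""
    v = np.linspace(-v_max, v_max, n_v)
    integrand = xi_power_law(r_p, pi - v / H0) * pairwise_velocity_pdf(v, sigma12)
    return trapezoid(integrand, v)

# Check the normalisation of f(v), then compare real and redshift space
# clustering at small projected separation along pi = 0.
v_grid = np.linspace(-5000.0, 5000.0, 2001)
norm = trapezoid(pairwise_velocity_pdf(v_grid, 500.0), v_grid)
xi_real = float(xi_power_law(0.5, 0.0))
xi_smeared = xi_redshift_space(0.5, 0.0, sigma12=500.0)
```

At small $r_p$ and $\pi = 0$ the convolved amplitude is lower than the real-space one, because pairs are smeared out along the line of sight; this is the "Fingers of God" elongation, and a larger $\sigma_{12}$ produces a longer finger.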

Madgwick et al. (2003) further measure the large scale anisotropies seen in
$\xi(r_p, \pi)$ for galaxies split by spectral type and find that $\beta = 0.49 \pm 0.13$ for
star forming galaxies and $\beta = 0.48 \pm 0.14$ for quiescent galaxies. This implies
a similar bias for both galaxy types on large scales, though they find that on
smaller scales integrated up to 8 $h^{-1}$ Mpc, the relative bias of quiescent to star
forming galaxies is $b_{\rm rel} = 1.45 \pm 0.14$.
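
As a rough check of the statement that the two spectral types have a similar large scale bias, one can assume the usual linear-theory definition $\beta \equiv \Omega_{\rm matter}^{0.6}/b$ (an assumption here, as the definition is not restated in this section), in which case $\Omega_{\rm matter}$ cancels in the ratio of the two measurements:

```latex
\frac{b_{\rm quiescent}}{b_{\rm star\ forming}}
  = \frac{\beta_{\rm star\ forming}}{\beta_{\rm quiescent}}
  = \frac{0.49}{0.48} \approx 1.02
```

Given the quoted uncertainties on $\beta$, this ratio is consistent with unity but only weakly constrained.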



7 The Evolution of Galaxy Clustering 

The observed clustering of galaxies is expected to evolve with time, as structure 
continues to grow due to gravity. The exact evolution depends on cosmological 
parameters such as $\Lambda$ and $\Omega_{\rm matter}$. Larger values of $\Lambda$, for example, lead to
larger voids and higher density contrasts between overdense and underdense
regions. By measuring the clustering of galaxies at higher redshift, one can break 
degeneracies that exist between the galaxy bias and cosmological parameters 
that are constrained by low redshift clustering measurements. It is therefore 
useful to determine the clustering of galaxies as a function of cosmic epoch, not 
only to further constrain cosmological parameters but also galaxy evolution. 

One might expect the galaxy clustering amplitude $r_0$ to increase over time,
as overdense regions become more overdense as galaxies move towards groups 
and clusters due to gravity. However, the exact evolution of the clustering 
of galaxies depends not only on gravity, but also on the expansion history of 
the Universe and therefore cosmological parameters such as $\Lambda$. Additionally,
over time new galaxies form while existing galaxies grow in both mass and 
luminosity. Therefore, the expected changes of galaxy clustering as a function 
of redshift depend both on relatively well-known cosmological parameters and 
less well-understood galaxy formation and evolution physics, which likely depends on
gas accretion, star formation, and feedback processes, as well as mergers. 

For a given cosmological model, one can predict how the clustering of dark
matter should evolve with time using cosmological N-body simulations. For
a $\Lambda$CDM Universe, $r_0$ for dark matter particles is expected to increase from
$\sim$0.8 $h^{-1}$ Mpc at $z = 3$ to $\sim$5 $h^{-1}$ Mpc at $z = 0$ (Weinberg et al. 2004).
However, according to hierarchical structure formation theories, at high redshifts
the first galaxies to form will be the first structures to collapse, which will
be biased tracers of the mass. The galaxy bias is expected to be a strong
function of redshift, initially $> 1$ at high redshift and approaching unity over
time. Therefore, $r_0$ for galaxies may be a much weaker function of time than it is
for dark matter, as the same galaxies are not observed as a function of redshift,
and over time new galaxies form in less biased locations in the Universe.

The projected angular and three dimensional correlation functions of galaxies
have been observed to $z \sim 5$. Star-forming Lyman break galaxies at $z \sim$ 3-5 are
found to have $r_0 \sim$ 4-6 $h^{-1}$ Mpc, with a bias relative to dark matter of $\sim$3-4
(Ouchi et al. 2004; Adelberger et al. 2005). Brighter Lyman break galaxies
are found to cluster more strongly than fainter Lyman break galaxies. The
correlation length, $r_0$, is found to be roughly constant between $z = 5$ and $z = 3$,
implying that the bias is increasing at earlier cosmic epochs. Spectroscopic galaxy
surveys at $z > 2$ are currently limited to samples of at most a few thousand
galaxies, so most clustering measurements are angular at these epochs. In one
such study by Wake et al. (2011), photometric redshifts of tens of thousands of
galaxies at $1 < z < 2$ are used to measure the angular clustering as a function of
stellar mass. They find a strong dependence of clustering amplitude on stellar
mass in each of three redshift intervals in this range.

At $z \sim 1$ larger spectroscopic galaxy samples exist, and three dimensional
two-point clustering analyses have been performed as a function of luminosity,
color, stellar mass, and spectral type. The same general clustering trends with
galaxy property that are observed at $z \sim 0$ are also seen at $z \sim 1$, in that
galaxies that are brighter, redder, early spectral type, and/or more massive
are more clustered. The clustering scale length of red galaxies is found to be
$\sim$5-6 $h^{-1}$ Mpc while for blue galaxies it is $\sim$3.5-4.5 $h^{-1}$ Mpc, depending on
luminosity (Meneux et al. 2006; Coil et al. 2008). At a given luminosity the
observed correlation length is only $\sim$15% smaller at $z = 1$ than $z = 0$, indicating
that unlike for dark matter the galaxy $r_0$ is roughly constant over time. These
results are consistent with predictions from $\Lambda$CDM simulations.

The measured values of $r_0$ at $z \sim 1$ imply that galaxies are more biased at $z = 1$
than at $z = 0$. Within the DEEP2 sample, the bias measured on scales of
$\sim$1-10 $h^{-1}$ Mpc varies from $\sim$1.25-1.55, with brighter galaxy samples being
more biased tracers of the dark matter (Coil et al. 2006). These results are
consistent with the idea that galaxies formed early on in the most overdense
regions of space, which are biased.

As in the nearby Universe, the clustering amplitude is a stronger function
of color than of luminosity at $z \sim 1$. Additionally, the color-density relation is
found to already be in place at $z = 1$, in that the relative bias of red to blue
galaxies is as high at $z = 1$ as at $z = 0.1$ (Coil et al. 2008). This implies that
the color-density relation is not due to cluster-specific physics, as most galaxies
at $z = 1$ in field spectroscopic surveys are not in clusters. Therefore it must
be physical processes at play in galaxy groups that initially set the color and
morphology-density relations. Red galaxies show larger "Fingers of God" in
$\xi(r_p, \pi)$ measurements than blue galaxies do, again showing that red galaxies
at $z = 1$ lie preferentially in virialized, more massive overdensities compared to
blue galaxies. Both red and blue galaxies show coherent infall on large scales.



8 Halo Model Interpretation of $\xi(r)$

The current paradigm of galaxy formation posits that galaxies form in the center 
of larger dark matter halos, collapsed overdensities in the dark matter distribution
with $\rho/\bar{\rho} \sim 200$, inside of which all mass is gravitationally bound. The
clustering of galaxies can then be understood as a combination of the cluster- 
ing of dark matter halos, which depends on cosmological parameters, and how 
galaxies populate dark matter halos, which depends on galaxy formation and 
evolution physics. For a given cosmological model the properties of dark mat- 
ter halos, including their evolution with time, can be studied in detail using 
N-body simulations. The masses and spatial distribution of dark matter halos 
should depend only on the properties of dark matter, not baryonic matter, and 
the expansion history of the Universe; therefore the clustering of dark matter 
halos should be insensitive to baryon physics. However, the efficiency of galaxy 
formation is very dependent on the complicated baryonic physics of, for exam- 
ple, star formation, gas cooling, and feedback processes. The halo model allows 
the relatively simple cosmological dependence of galaxy clustering to be cleanly 
separated from the more complex baryonic astrophysics, and it shows how clus- 
tering measurements for a range of galaxy types can be used to constrain galaxy 
evolution physics.

8.1 Estimating the Mean Halo Mass from the Bias

One can use the observed large scale clustering amplitude of different observed 
galaxy populations to identify the typical mass of their parent dark matter 
halos, in order to place these galaxies in a cosmological context. The large 
scale clustering amplitude of dark matter halos as a function of halo mass is 
well determined in N-body simulations, and analytic fitting formulae are provided
by, e.g., Mo & White (1996) and Sheth et al. (2001). Analytic models
can then predict the clustering of both dark matter particles and galaxies as a
function of scale, by using the clustering of dark matter halos and the radial
density profile of dark matter and galaxies within those halos (Ma & Fry 2000;
Peacock & Smith 2000; Seljak 2000). In this scheme, on large, linear scales
where $\delta < 1$ ($\rho/\bar{\rho} \sim 1$), the clustering of a given galaxy population can be used
to determine the mean mass of the dark matter halos hosting those galaxies, for 
a given cosmological model. To achieve this, the large-scale bias is estimated by 
comparing the observed galaxy clustering amplitude with that of dark matter 
in an N-body simulation, and then galaxies are assumed to reside in halos of a 
given mass that have the same bias in simulations. 

Simulations show that higher mass halos cluster more strongly than lower
mass halos (Sheth & Tormen 1999). This then leads to an interpretation of
galaxy clustering as a function of luminosity in which luminous galaxies reside
in more massive dark matter halos than less luminous galaxies. Similarly, red 
galaxies typically reside in more massive halos than blue galaxies of the same 
luminosity; this is observationally verified by the larger "Fingers of God" ob- 
served for red galaxies. Combining the large scale bias with the observed galaxy 
number density further allows one to constrain the fraction of halos that host 
a given galaxy type, by comparing the galaxy space density to the parent dark 
matter halo space density. This constrains the duty cycle or fraction of halos 
hosting galaxies of a given population. 
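
The two steps described in this subsection, matching an observed large scale bias to a halo mass and comparing number densities to obtain a duty cycle, can be sketched as follows. The tabulated bias-mass relation below is a made-up monotonic placeholder, not the output of any simulation or published fitting formula, and the function names and number densities are likewise hypothetical.

```python
import numpy as np

def halo_mass_from_bias(b_obs, log_mass_grid, bias_grid):
    """Invert a monotonically increasing b(M) table by linear
    interpolation in log10(M); returns the mass in the table's units."""
    return 10.0 ** np.interp(b_obs, bias_grid, log_mass_grid)

def duty_cycle(n_gal, n_halo):
    """Fraction of halos hosting a galaxy of the given population."""
    return n_gal / n_halo

# Placeholder table: log10(M / h^-1 Msun) versus large scale halo bias.
# These numbers are illustrative only.
log_mass_grid = np.array([11.0, 12.0, 13.0, 14.0])
bias_grid = np.array([0.7, 0.9, 1.3, 2.2])

m_halo = halo_mass_from_bias(1.1, log_mass_grid, bias_grid)
f_duty = duty_cycle(n_gal=1.0e-4, n_halo=4.0e-4)   # densities in h^3 Mpc^-3
```

In a real analysis the bias-mass table would be replaced by a relation calibrated on N-body simulations for the assumed cosmology, and the halo number density would be taken from the corresponding halo mass function.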



8.2 Halo Occupation Distribution Modeling 

While such estimates of the mean host halo mass and duty cycle are fairly 
straightforward to carry out, a greater understanding of the relation between 
galaxy light and dark matter mass is gleaned by performing halo occupation 
distribution modeling. 

The general halo-based model discussed above, in which the clustering of
galaxies reflects the clustering of halos, was further developed by Peacock & Smith
(2000) to include the efficiency of galaxy formation, or how galaxies populate
halos. The proposed model depends on both the halo occupation number, equal
to the number of galaxies in a halo of a given mass, for a galaxy sample brighter
than some limit, and the location of the galaxies within these halos. In the
Peacock & Smith (2000) model it is assumed that one galaxy is at the center of
the halo (the "central" galaxy), and the rest of the galaxies in the same halo are
"satellite" galaxies that trace the dark matter radial mass distribution, which
follows an NFW profile (Navarro et al. 1997). The latter assumption results in
a general power law shape for the galaxy correlation function.

A similar idea was proposed by Benson et al. (2000), who used a semi-analytic
model in conjunction with a cosmological N-body simulation to show
that the observed galaxy $\xi(r)$ could be reproduced with a $\Lambda$CDM simulation
(though not with a $\tau$CDM simulation with $\Omega_{\rm matter} = 1$). They also employ
a method for locating galaxies inside dark matter halos such that one galaxy
resides at the center of all halos above a given mass threshold, while additional
galaxies are assigned the location of a random dark matter particle within the
same halo, such that galaxies have the same NFW radial profile within halos as
the dark matter particles (see Fig. 10, left panel).

In these models, the clustering of galaxies on scales larger than a typical
halo ($\sim$1-2 $h^{-1}$ Mpc) results from pairs of galaxies in separate halos, called
the "two-halo term", while the clustering on smaller scales ($< 1\ h^{-1}$ Mpc) is
due to pairs of galaxies within the same parent halo, called the "one-halo term".
When the pairs from these two terms are added together, the resulting galaxy
correlation function should roughly follow a power law.

Benson et al. (2000) show that on large scales there is a simple relation
in the bias between galaxies and dark matter halos, while on small scales the
correlation function depends on the number of galaxies in a halo and the finite
size of halos. When the clustering signal from these two scales (corresponding to
the "two-halo" and "one-halo" terms) is combined, a power law results for the
galaxy $\xi(r)$ (right panel, Fig. 10). Galaxies are found to be anti-biased relative
to dark matter (i.e., less clustered than the dark matter) on scales smaller than
the typical halo, though the bias is close to unity on larger scales. The clustering
of galaxies that results from this semi-analytic model is also found to match the
observed clustering of galaxies in the APM survey, above a given luminosity
threshold (Baugh 1999).

By defining the halo occupation distribution (HOD) as the probability that a
halo of a given mass contains $N$ galaxies, $P(N|M)$, Berlind & Weinberg (2002)
quantify how the observed galaxy $\xi(r)$ depends on different HOD model
parameters. Using N-body simulations, they identify dark matter halos and place
galaxies into the simulation using a simple HOD model with two parameters:
a minimum mass at which a halo hosts, on average, one central galaxy ($M_{\rm min}$)
at the center of the halo, and the slope ($\alpha$) of the $P(N|M)$ function for
satellite galaxies. The latter determines how many satellite galaxies there are as
a function of halo mass. They further assume that the satellite galaxies
follow an NFW profile, as the dark matter does, though the concentration of the
radial profile can be changed. They show that the "two-halo term" is simply
the halo center correlation function weighted by a large scale bias factor, while
the "one-halo term" is sensitive to both $\alpha$ and the concentration of the galaxy
profile within halos. Obtaining a power law $\xi(r)$ therefore strongly constrains
the HOD model parameters.



Kravtsov et al. (2004) propose that the locations of satellite galaxies within
dark matter halos should correspond to locations of subhalos, distinct
gravitationally bound regions within the larger dark matter halos, instead of tracing
random dark matter particles. Using cosmological N-body simulations, they
show that at $z > 1$ $\xi(r)$ for galaxies should deviate strongly from a power law
on small scales, due to a rise in the "one-halo term". In this model, the clustering
of galaxies can be understood as the clustering of dark matter parent halos
and subhalos, and the power law shape that is observed at $z \sim 0$ is a coincidence
of the one- and two-halo terms having similar amplitudes and slopes at
the typical scale of halos. They find that the formation and evolution of halos
and subhalos through merging and dynamical processes are the main physical
drivers of large scale structure.

Figure 10: Left: The large scale structure seen in a $\Lambda$CDM N-body dark matter
only simulation of size $141 \times 141 \times 8\ h^{-3}$ Mpc$^3$. The grey scale indicates the
density of dark matter, while the locations of galaxies are shown with open
circles. Galaxies are added to the simulation output using a semi-analytic model
which assumes that dark matter halos above a given mass threshold have at least
one "central" galaxy located at the center of the halo. Higher mass halos contain
additional "satellite" galaxies, which are assigned the location of a random dark
matter particle in the halo. Taken from Benson et al. (2000). Right: The two-point
correlation function of dark matter particles (dotted line) and galaxies
(solid line with dashed line showing Poisson error bars) in the simulation of
Benson et al. (2000), compared with the observed clustering of galaxies in the
APM survey (open squares) (Baugh 1996).

With the unprecedentedly large galaxy sample with spectroscopic redshifts
that is provided by SDSS, departures from a power law $\xi(r)$ were detected by
Zehavi et al. (2004), using a volume-limited subsample of 22,000 galaxies from
a parent sample of 118,000 galaxies. The deviations from a power law are small
enough at $z \sim 0$ that a large sample covering a sufficiently large cosmological
volume is required to overcome the errors due to cosmic variance to detect these
small deviations. It is found that there is a change in the slope of $\xi(r)$ on scales
of $\sim$1-2 $h^{-1}$ Mpc; this corresponds to the scale at which the one- and two-halo
terms are equal (see Fig. 11). Zehavi et al. (2004) find that $\xi(r)$ measured from
the SDSS data is better fit by an HOD model, which includes small deviations
from a power law, than by a pure power law. The HOD model that is fit has
three parameters: the minimum mass to host a single central galaxy ($M_{\rm min}$), the
minimum mass to host a single satellite galaxy ($M_1$), and the slope of $P(N|M)$
($\alpha$), which determines the average number of satellite galaxies as a function
of host halo mass. In this model, dark matter halos with $M_{\rm min} < M < M_1$
host a single galaxy, while above $M_1$ they host, on average, $(M/M_1)^\alpha$ galaxies.
Using $w_p(r_p)$, one can fit for $M_1$ and $\alpha$, while the observed space density of
galaxies is used to derive Mmin- For a galaxy sample with Mr < — 21, the best- 
fit HOD parameters are Mmi„ = 6.1 x lO^^ /j-IMq, Mi ^ 4.7 x lO^^ h-'^Mq, 
and a = 0.89. 
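
The occupation rule described above can be written down in a few lines. The
following Python sketch is illustrative only: the function name
`mean_occupation` is hypothetical, and the step-like form (exactly zero galaxies
below the minimum mass, exactly one between the two thresholds) is a simplified
reading of the description in the text, not necessarily the exact
parameterization used in the published fit. The numerical values are assumed,
order-of-magnitude inputs in units of $h^{-1} M_\odot$.

```python
def mean_occupation(mass, m_min, m_1, alpha):
    """Mean number of galaxies hosted by a halo of the given mass.

    mass, m_min and m_1 must share the same units.
    """
    if mass < m_min:
        return 0.0                    # too small to host a galaxy in the sample
    if mass < m_1:
        return 1.0                    # a single (central) galaxy
    return (mass / m_1) ** alpha      # above m_1: (M / M_1)^alpha on average


# Illustrative (assumed) parameter values, in h^-1 M_sun.
M_MIN, M_1, ALPHA = 6.1e12, 4.7e13, 0.89

for halo_mass in (1e12, 1e13, 1e14, 1e15):
    n_gal = mean_occupation(halo_mass, M_MIN, M_1, ALPHA)
    print(f"M = {halo_mass:.1e}  ->  <N> = {n_gal:.2f}")
```

In a sketch like this the satellite contribution is simply the mean occupation
minus one for halos above the satellite threshold, which makes it easy to see
why a shallower or steeper slope changes the satellite fraction of a sample.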

8.3 Interpreting the Luminosity and Color Dependence of 
Galaxy Clustering 

In general, these HOD parameters reflect the efficiency of galaxy formation and
evolution and can be a function of galaxy properties such as luminosity, color,
stellar mass, and morphology. Zehavi et al. (2011) present HOD fits to SDSS
samples as a function of luminosity and color and find that $\alpha$ is generally
~1.0-1.1, though it is somewhat higher for the brightest galaxies (~1.3 for
$M_r < -22.0$). There is a strong trend between luminosity and halo mass;
$M_{\rm min}$ varies as a function of luminosity from $10^{11}\,h^{-1} M_\odot$
for $M_r < -18$ to $10^{14}\,h^{-1} M_\odot$ for $M_r < -22$. $M_1$ is generally
~17 times higher than $M_{\rm min}$ for all luminosity threshold samples (see
Fig. 12). This implies that a halo with two galaxies above a given luminosity is
~17 times more massive than a halo hosting one galaxy above the same luminosity
limit. Further, the fraction of galaxies that are satellites decreases at higher
luminosities, from ~33% at $M_r < -18$ to 4% at $M_r < -22$. The right panel of
Fig. 12 shows the ratio of the virial halo mass to the central galaxy r-band
luminosity (a mass-to-light ratio) as a function of halo mass. This figure shows
that halos of mass ~$4 \times 10^{11}\,h^{-1} M_\odot$ are maximally efficient
at galaxy formation, i.e., at converting baryons into light.

Figure 11: The projected correlation function, $w_p(r_p)$, for SDSS galaxies with
$M_r < -21$ is shown as data points with error bars. The best-fit HOD model is
shown as a solid line, with the contributions from the one and two halo terms
shown with dotted lines. The projected correlation function of dark matter at
this redshift is shown with a dashed line. The bottom panel shows deviations
in $w_p(r_p)$ for the data and the HOD model from the best-fit power law. Taken
from Zehavi et al. (2004).

In terms of the color dependence of galaxy clustering, the trend at fainter
luminosities of red galaxies being strongly clustered (with a higher correlation
slope, $\gamma$; see Fig. 8) is due to faint red galaxies being satellite
galaxies in relatively massive halos that host bright red central galaxies
(Berlind et al., 2005). HOD modeling therefore provides a clear explanation for
the increased clustering observed for faint red galaxies. For a given luminosity
range ($-20 < M_r < -19$), Zehavi et al. (2011) fit a simplified HOD model with
only one parameter and find that the fraction of galaxies that are satellites is
much higher for red than for blue galaxies, with ~25% of blue galaxies being
satellites and ~60% of red galaxies being satellites. They find that blue
galaxies reside in halos with a
median mass of 10^^'^ H^^Mq, while red galaxies reside in higher mass halos 
with a median mass of 10^^'^ H^-^Mq. However, at a given luminosity, there is 
not a strong trend between color and halo mass (though there is a strong trend
between luminosity and halo mass). Instead, the differences in $w_p(r_p)$ reflect
a trend between color and satellite fraction; the increased satellite fraction,
in particular, drives the slope of $\xi(r)$ to be steeper for red galaxies than
for blue galaxies. And while the HOD slope $\alpha$ does not change much with
increasing luminosity, it does with color, due to the dependence of the satellite
fraction on color. Having a higher satellite fraction also places more galaxies
in high mass halos (as those host the groups and clusters that contain the
satellite galaxies), which increases the large scale bias and boosts the one
halo term relative to the two halo term. The HOD model facilitates
interpretation of the observed luminosity and color dependence of galaxy
clustering and provides strong constraints on models of how galaxies form and
evolve within their parent dark matter halos.

Figure 12: Left: The characteristic mass scale of dark matter halos hosting
galaxies as a function of the luminosity threshold of the galaxy sample. Both
the minimum halo mass to host a single galaxy ($M_{\rm min}$) and the
minimum mass to host additional satellite galaxies ($M_1$) are shown. A strong
relationship clearly exists between halo mass and galaxy luminosity. Right: The
ratio of the halo mass to the median central galaxy luminosity as a function of
halo mass. Taken from Zehavi et al. (2011).

8.4 Interpreting the Evolution of Galaxy Clustering 

As mentioned in Section 7 above, the galaxies that are observed for clustering 
measurements at different redshifts are not necessarily the same populations 
across cosmic time. A significant hurdle in understanding galaxy evolution is 
knowing how to connect different observed populations at different redshifts. 
Galaxy clustering measurements can be combined with theoretical models to 
trace observed populations with redshift, in that for a given cosmology one can 
model how the clustering of a given population should evolve with time. 

The observed evolution of the luminosity-dependence of galaxy clustering
can be fit surprisingly well using a simple non-parametric, non-HOD model that
relates the galaxy luminosity function to the halo mass function. Conroy et al.
(2006) show that directly matching galaxies as a function of luminosity to host
halos and subhalos as a function of mass leads to a model for the luminosity-
dependent clustering that matches observations from z ~ 0 to z ~ 3. In this
model, the only inputs are the observed galaxy luminosity function at each
epoch of interest and the dark matter halo (and subhalo) mass function from
N-body simulations. Galaxies are then ranked by luminosity and halos by mass
and matched one-to-one, such that lower luminosity galaxies are associated with
halos of lower mass, and galaxies above a given luminosity threshold are assigned
to halos above a given mass threshold with the same abundance or number
density. This "abundance matching" method uses as a proxy for halo mass the
maximum circular velocity ($V_{\rm max}$) of the halo; for subhalos they find
that it is necessary to use the value of $V_{\rm max}$ when the subhalo is first
accreted into a larger halo, to avoid the effects of tidal stripping. With this
simple model the clustering amplitude and shape as a function of luminosity are
matched for SDSS galaxies at z ~ 0, DEEP2 galaxies at z ~ 1, and Lyman break
galaxies at z ~ 3. In particular, the clustering amplitude in both the one and
two halo regimes is well fit, including the deviations from a power law that are
seen at z > 1 (Ouchi et al., 2005; Coil et al., 2006). These results imply a
tight correlation between galaxy luminosity and halo mass from z ~ 0 to z ~ 3.
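
The rank-ordering at the heart of abundance matching is simple enough to sketch
directly. The Python below is a toy illustration under stated assumptions, not
the published pipeline: the function names are hypothetical, both input lists
are assumed to come from the same volume (so that equal rank means equal
cumulative number density), and the halo list is assumed to already hold the
maximum circular velocity of each halo, taken at accretion for subhalos.

```python
def abundance_match(luminosities, halo_vmax):
    """Pair galaxies with halos one-to-one by rank, brightest with largest vmax.

    Returns a list of (luminosity, vmax) pairs, brightest galaxy first.
    Objects left over in the longer list are not matched.
    """
    n = min(len(luminosities), len(halo_vmax))
    galaxies = sorted(luminosities, reverse=True)[:n]
    halos = sorted(halo_vmax, reverse=True)[:n]
    return list(zip(galaxies, halos))


def vmax_threshold(pairs, lum_threshold):
    """Smallest matched vmax among galaxies at or above lum_threshold."""
    matched = [vmax for lum, vmax in pairs if lum >= lum_threshold]
    return min(matched) if matched else None


# Toy inputs in arbitrary units (assumed values, for illustration only).
toy_luminosities = [0.3, 2.0, 1.1, 0.7]
toy_vmax = [120.0, 310.0, 95.0, 220.0, 180.0]

pairs = abundance_match(toy_luminosities, toy_vmax)
print(pairs)
print(vmax_threshold(pairs, 1.0))
```

A luminosity threshold therefore maps onto a halo threshold with the same number
of objects above it, which is the sense in which the method has no free
parameters.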

While abundance-matching techniques provide a simple, zero-parameter
model for how galaxies populate halos, a richer understanding of the physical
properties involved may be gained by performing HOD modeling. Zheng et al.
(2007) use HOD modeling to fit the observed luminosity-dependent galaxy
clustering at z ~ 0 measured in SDSS and that measured at z ~ 1 in DEEP2, and
confirm that at both epochs there is a tight relationship between the central
galaxy luminosity and host halo mass. At z ~ 1 the satellite fraction drops for
higher luminosities, as at z ~ 0, but at a given luminosity the satellite fraction
is higher at z ~ 0 than at z ~ 1. They also find that at a given central
luminosity, halos are ~1.6 times more massive at z ~ 0 than at z ~ 1, and at a
given halo mass galaxies are ~1.4 times more luminous at z ~ 1 than at z ~ 0.



Zheng et al. (2007) further combine these HOD results with theoretical
predictions of the growth of dark matter halos from simulations to link z ~ 1
central galaxies to their descendants at z ~ 0 and find that the growth of both
halo mass and stellar mass as a function of redshift depends on halo mass. Lower
mass halos grow earlier, which is reflected in the fact that more of their z ~ 0
mass is already assembled by z ~ 1. A typical z ~ 0 halo with mass
$3 \times 10^{11}\,h^{-1} M_\odot$ has about 70% of its final mass in place by
z ~ 1, while a z ~ 0 halo with mass $10^{13}\,h^{-1} M_\odot$ has ~50% of its
final mass in place at z ~ 1. In terms of stellar mass, however, in a z ~ 0 halo
of mass $5 \times 10^{11}\,h^{-1} M_\odot$ a central galaxy has ~20% of its
stellar mass in place at z ~ 1, while the fraction rises to ~33% above a halo
mass of $2 \times 10^{12}\,h^{-1} M_\odot$. They further find that the mass scale
of the maximum star formation efficiency for central galaxies shifts to lower
halo mass with time, with a peak of ~$10^{12}\,h^{-1} M_\odot$ at z ~ 1 and
~$6 \times 10^{11}\,h^{-1} M_\odot$ at z ~ 0.



At 1 < z < 2, Wake et al. (2011) use precise photometric redshifts from the
NEWFIRM survey to measure the relationship between stellar mass and dark
matter halo mass using HOD models. At these higher redshifts $r_0$ varies from
~6 to ~11 $h^{-1}$ Mpc for stellar masses of ~$10^{10}\,M_\odot$ to
$10^{11}\,M_\odot$. The galaxy bias is a function of both redshift and stellar
mass and is ~2.5 at z ~ 1 and increases to ~3.5 at z ~ 2. They find that the
typical halo mass of both central and satellite galaxies increases with stellar
mass, while the satellite fraction drops at higher stellar mass, qualitatively
similar to what is found at lower redshift. They do not find evolution in the
relationship between stellar mass and halo mass between z ~ 2 and z ~ 1, but do
find evolution compared to z ~ 0. They also find that the peak of star formation
efficiency shifts to lower halo mass with time.

Simulations can also be used to connect different observed galaxy populations
at different redshifts. An example of the power of this method is shown
by Conroy et al. (2008), who compare the clustering and space density of star
forming galaxies at z ~ 2 with that of star forming and quiescent galaxies at
z ~ 1 and z ~ 0 to infer the typical descendants of the z ~ 2 star forming
galaxies and to constrain the fraction that have merged with other galaxies by
z ~ 0. They use halos and subhalos identified in a ΛCDM N-body simulation
to determine which halos at z ~ 2 likely host star forming galaxies, and then use
the merger histories in the simulation to track these same halos to lower
redshift. By comparing these results to observed clustering of star forming
galaxies at z ~ 1 and z ~ 0 they can identify the galaxy populations at these
epochs that are consistent with being descendants of the z ~ 2 galaxies. They
find that while the lower redshift descendant halos have clustering strengths
similar to red galaxies at both z ~ 1 and z ~ 0, the z ~ 2 star forming galaxies
cannot all evolve into red galaxies by lower redshift, as their space density is
too high. There are many more lower redshift descendants than there are red
galaxies, even after taking into account mergers. They conclude that most z ~ 2
star forming galaxies evolve into typical L* galaxies today, while a
non-negligible fraction become satellite galaxies in larger galaxy groups and
clusters.

In summary, N-body simulations and HOD modeling can be used to interpret 
the observed evolution of galaxy clustering and further constrain both cosmo- 
logical parameters and theoretical models of galaxy evolution beyond what can 
be gleaned from z ~ 0 observations alone. They also establish links between
distinct observed galaxy populations at different redshifts, allowing one to create
a coherent picture of how galaxies evolve over cosmic time. 



9 Voids and Filaments 

Redshift surveys unveil a rich structure of galaxies, as seen in Fig. 3. In addi- 
tion to measuring the two-point correlation function to quantify the clustering 
amplitude as a function of galaxy properties, one can also study higher-order 
clustering measurements as well as properties of voids and filaments. 



9.1 Higher-order Clustering Measurements 

Higher-order clustering statistics reflect both the growth of initial density
fluctuations and the details of galaxy biasing (Bernardeau et al., 2002), such
that measurements of higher-order clustering can test the paradigm of structure
formation through gravitational instability as well as constrain the galaxy bias.
In the linear regime there is a degeneracy between the amplitude of fluctuations
in the dark matter density field and the galaxy bias, in that a highly clustered
galaxy population may be biased and trace only the most overdense regions
of the dark matter, or the dark matter itself may be highly clustered. However,
this degeneracy can be broken in the non-linear regime on small scales.
Over time, the density field becomes skewed towards high density, as $\delta$
becomes greater than unity in overdense regions (where
$\delta \equiv \rho/\bar{\rho} - 1$) but cannot fall below -1 in underdense
regions. Skewness in the galaxy density distribution can also arise from galaxy
bias, if galaxies preferentially form in the highest density peaks. One can
therefore use the shapes of the galaxy overdensities, through measurements of
the three-point correlation function, to test gravitational collapse versus
galaxy bias.

To study higher-order clustering one needs large samples that cover enormous
volumes; all studies to date have focused on low redshift galaxies. Verde et al.
(2002) use 2dFGRS to measure the Fourier transform of the three-point
correlation function, called the bispectrum, to constrain the galaxy bias without
resorting to comparisons with N-body simulations in order to measure the
clustering of dark matter. Fry & Gaztañaga (1993) present the galaxy bias in
terms of a Taylor expansion of the density contrast, where the first order term
is the linear term, while the second order term is the non-linear or quadratic
term. Measured on scales of 5-30 $h^{-1}$ Mpc, Verde et al. (2002) find that the
linear galaxy bias is consistent with unity ($b_1 = 1.04 \pm 0.11$), while the
non-linear quadratic bias is consistent with zero ($b_2 = -0.05 \pm 0.08$). When
combined with the redshift space distortions measured in the two-dimensional
two-point correlation function ($\xi(r_p, \pi)$), they measure
$\Omega_{\rm matter} = 0.27 \pm 0.06$ at z = 0.17. This constraint on the matter
density of the Universe is derived from large scale structure data alone.
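
For reference, the bias expansion mentioned above is commonly written as
follows. This is a sketch in standard notation rather than a formula quoted from
the studies cited: $\delta_g$ and $\delta$ denote the galaxy and matter density
contrasts, the coefficients are the linear and quadratic bias terms, and the
symbols $Q_g$ and $Q$ for the reduced galaxy and matter bispectra are introduced
here for illustration only.

```latex
\[
  \delta_g \simeq b_1\,\delta + \frac{b_2}{2}\,\delta^{2},
  \qquad
  Q_g \simeq \frac{Q}{b_1} + \frac{b_2}{b_1^{2}} .
\]
```

To leading order the linear bias rescales the dependence of the reduced
bispectrum on triangle shape, while the quadratic bias adds a shape-independent
offset. Measuring triangles of different shapes can therefore separate the two
terms, which is how the degeneracy present in two-point statistics is broken.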

Gaztañaga et al. (2005) measure the three-point correlation function in 2dFGRS
for triangles of galaxy configurations with different shapes. Their results
are consistent with ΛCDM expectations regarding gravitational instability of
initial Gaussian fluctuations. Furthermore, they find that while the linear bias
is consistent with unity ($b_1 = 0.93^{+0.10}_{-0.08}$), the quadratic bias is
non-zero ($b_2/b_1 = -0.34^{+0.11}_{-0.08}$). This implies that there is a
non-gravitational contribution to the three-point function, resulting from
galaxy formation physics. These results differ from those of Verde et al.
(2002), which may be due to the inclusion by Gaztañaga et al. (2005) of the
covariance between measurements on different scales. Gaztañaga et al. (2005)
combine their results with the measured two-point correlation function to derive
$\sigma_8 = 0.88^{+0.12}_{-0.10}$.

If the density field follows a Gaussian distribution, the higher-order
clustering terms can be expressed solely in terms of the lower order clustering
terms. This "hierarchical scaling" holds for the evolution of an initially
Gaussian distribution of fluctuations under gravitational instability. Therefore
departures from hierarchical scaling can result either from a non-Gaussian
initial density field or from galaxy bias. Redshift space higher-order
clustering measurements in 2dFGRS are performed by Baugh et al. (2004) and
Croton et al. (2004a), who measure up to the six-point correlation function.
They find that hierarchical scaling is obeyed on small scales, though deviations
exist on larger scales ($\gtrsim 10\,h^{-1}$ Mpc). They show that on large scales
the higher-order terms can be significantly affected by massive rare peaks such
as superclusters, which populate the tail of the overdensity distribution.
Croton et al. (2004a) also show that the three-point function has a weak
luminosity dependence, implying that galaxy bias is not entirely linear. These
results are confirmed by Nichol et al. (2006) using galaxies in the SDSS, who
also measure a weak luminosity dependence in the three-point function. They find
that on scales >10 $h^{-1}$ Mpc the three-point function is greatly affected by
the "Sloan Great Wall", a massive supercluster that is roughly 450 $h^{-1}$ Mpc
(Gott et al., 2005) in length and is associated with tens of known Abell
clusters. These results show that even 2dFGRS and SDSS are not large enough
samples to be unaffected by the most massive, rare structures.
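
The hierarchical scaling tested in these studies is usually expressed in terms
of volume-averaged correlation functions. The relation below is a sketch in
standard notation; the symbols $\bar{\xi}_p$ (the p-point correlation function
averaged over a cell of radius R) and $S_p$ (the hierarchical amplitudes) are
assumptions introduced here and are not defined in the text above.

```latex
\[
  \bar{\xi}_p(R) \;=\; S_p \,\bigl[\bar{\xi}_2(R)\bigr]^{\,p-1},
  \qquad p \ge 3 .
\]
```

Hierarchical scaling holds when the amplitudes $S_p$ (with $S_3$ the skewness)
are approximately independent of scale, so that a measurement of
$S_p = \bar{\xi}_p / \bar{\xi}_2^{\,p-1}$ that varies with R signals a departure
of the kind described above.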

Several studies have examined higher-order correlation functions for galaxies
split by color. Gaztañaga et al. (2005) find a strong dependence of the three-
point function on color and luminosity on scales <6 $h^{-1}$ Mpc. Croton et al.
(2007) measure up to the five-point correlation function in 2dFGRS for both blue
and red galaxies and find that red galaxies are more clustered than blue galaxies
in all of the N-point functions measured. They also find a luminosity-dependence
in the hierarchical scaling amplitudes for red galaxies but not for blue galaxies.
Taken together, these results explain why the full galaxy population shows only
a weak correlation with luminosity.



9.2 Voids 

In maps of the large scale structure of galaxies, voids stand out starkly to the 
eye. There appear to be vast regions of space with few, if any, L* galaxies. Voids 
are among the largest structures observed in the Universe, spanning typically 
tens of Mpc. 

The statistics of voids - their sizes, distribution, and underdensities - are
closely tied to cosmological parameters and the physical details of structure
formation. While the two-point correlation function provides a full description of
clustering for a Gaussian distribution, departures from Gaussianity can be tested
with higher-order correlation statistics and voids. For example, the abundance
of voids can be used to test the non-Gaussianity of primordial perturbations,
which constrains models of inflation (Kamionkowski et al., 2009). Additionally,
voids provide an extreme low density environment in which to study galaxy
evolution. As discussed by Peebles (2001), the lack of galaxies in voids should
provide a stringent test for galaxy formation models.



9.2.1 Void and Void Galaxy Properties

The first challenge in measuring the properties of voids and void galaxies is
defining the physical extent of individual voids and identifying which galaxies
are likely to be in voids. The "void finder" algorithm of El-Ad & Piran (1997)
is based on the point distribution of galaxies (i.e., it does not perform any
smoothing). This algorithm does not assume that voids are entirely devoid of
galaxies and identifies void galaxies as those with three or fewer neighboring
galaxies within a sphere whose radius is defined by the mean and standard
deviation of the distance to the third nearest neighbor for all galaxies. All
other galaxies are termed "wall" galaxies. An individual void is then identified
as the maximal sphere that contains only void galaxies (see Fig. 13). This
algorithm is widely used by both theorists and observers.

Figure 13: Void and wall galaxies in the SDSS. Shown is a projection of a 10
$h^{-1}$ Mpc slab with wall galaxies plotted as black crosses and void galaxies
plotted as red crosses. Blue circles indicate the intersection of the maximal
sphere of each void with the midplane of the slab (from Pan et al. 2011).
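
The classification step described above (separating void galaxies from wall
galaxies) can be sketched as follows. This is a minimal brute-force illustration
in Python, not the published implementation: the function name is hypothetical,
and the search radius is taken to be the mean third-nearest-neighbor distance
plus `n_sigma` standard deviations, where the default `n_sigma = 1.5` is an
assumption, since the text only says that the radius is defined by the mean and
standard deviation.

```python
import math
import statistics


def classify_void_galaxies(positions, n_sigma=1.5, max_neighbors=3):
    """Flag each galaxy as a void galaxy (True) or a wall galaxy (False).

    positions: list of (x, y, z) tuples; at least four galaxies are required.
    A galaxy is flagged as a void galaxy when it has max_neighbors or fewer
    neighbors within the search radius (assumed: mean + n_sigma * std of the
    third-nearest-neighbor distance over all galaxies).
    """
    if len(positions) < 4:
        raise ValueError("need at least four galaxies")

    # Sorted distances from each galaxy to every other galaxy (O(N^2)).
    sorted_dists = [
        sorted(math.dist(p, q) for j, q in enumerate(positions) if j != i)
        for i, p in enumerate(positions)
    ]
    third_nearest = [d[2] for d in sorted_dists]
    radius = statistics.mean(third_nearest) + n_sigma * statistics.pstdev(third_nearest)

    return [sum(1 for x in d if x <= radius) <= max_neighbors for d in sorted_dists]
```

Identifying the voids themselves would then require a separate search for the
largest spheres that contain no wall galaxies, which is not attempted in this
sketch.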

Cosmological simulations of structure formation show that the distribution
and density of galaxy voids are sensitive to the values of $\Omega_{\rm matter}$
and $\Omega_\Lambda$ (Kauffmann et al., 1990). Using ΛCDM N-body dark matter
simulations, Colberg et al. (2005) study the properties of voids within the dark
matter distribution and predict that voids are very underdense (though not
empty) up to a well-defined, sharp edge in the dark matter density. They predict
that 61% of the volume of space should be filled by voids at z = 0, compared to
28% at z = 1 and 9% at z = 2. They also find that the mass function of dark
matter halos in voids is steeper than in denser regions of space.

Using similar ΛCDM N-body simulations with a semi-analytic model for
galaxy evolution, Benson et al. (2003) show that voids should contain both dark
matter and galaxies, and that the dark matter halos in voids tend to be low
mass and therefore contain fewer galaxies than in higher density regions. In
particular, at density contrasts of $\delta < -0.6$, where
$\delta \equiv \rho/\rho_{\rm mean} - 1$, both dark matter halos and galaxies in
voids should be anti-biased relative to dark matter. However, galaxies are
predicted to be more underdense than the dark matter halos, assuming simple
physically motivated prescriptions for galaxy evolution. They also predict the
statistical size distribution of voids, finding that there should be more voids
with smaller radii (<10 $h^{-1}$ Mpc) than larger radii.

The advent of the 2dFGRS and SDSS provided the first very large samples
of voids and void galaxies that could be used to robustly measure their
statistical properties. Applying the "void finder" algorithm to the 2dFGRS
dataset, Hoyle & Vogeley (2004) find that the typical radius of voids is ~15
$h^{-1}$ Mpc. Voids are extremely underdense, with an average density of
$\delta\rho/\rho = -0.94$, with even lower densities at the center, where fewer
galaxies lie. The volume of space filled by voids is ~40%. Probing an even larger
volume of space using the SDSS dataset, Pan et al. (2011) find a similar typical
void radius and conclude that ~60% of space is filled by voids, which have
$\delta\rho/\rho = -0.85$ at their edges. Voids have sharp density profiles, in
that they remain extremely underdense to the void radius, where the galaxy
density rises steeply. These observational results agree well with the
predictions of ΛCDM simulations discussed above.

Studies of the properties of galaxies in voids allow an understanding of how
galaxy formation and evolution progresses in the lowest density environments
in the Universe, effectively pursuing the other end of the density spectrum from
cluster galaxies. Void galaxies are found to be significantly bluer and fainter
than wall galaxies (Rojas et al., 2004). The luminosity function of void galaxies
shows a lack of bright galaxies but no difference in the measured faint end
slope (Croton et al., 2005; Hoyle et al., 2005), indicating that dwarf galaxies
are not likely to be more common in voids. The normalization of the luminosity
function of wall galaxies is roughly an order of magnitude higher than that of
void galaxies; therefore galaxies do exist in voids, just with a much lower space
density. Studies of the optical spectra of void galaxies show that they have
high star formation rates, low 4000 Å spectral breaks indicative of young stellar
populations, and low stellar masses, resulting in high specific star formation
rates (Rojas et al., 2005).



However, red quiescent galaxies do exist in voids, just with a lower space
density than blue, star forming galaxies (Croton et al., 2005). Croton & Farrar
(2008) show that the observed luminosity function of void galaxies can be
replicated with a ΛCDM N-body simulation and simple semi-analytic prescriptions
for galaxy evolution. They explain the existence of red galaxies in voids as
residing in the few massive dark matter halos that exist in voids. Their model
requires some form of star formation quenching in massive halos (>^ 10^^ Mq), 
but no additional physics that operates only at low density needs to be included 
in their model to match the data. It is therefore the shift in the halo mass 
function in voids that leads to different galaxy properties, not a change in the 
galaxy evolution physics in low density environments. 



9.2.2 Void Probability Function 

In addition to identifying individual voids and the galaxies in them, one can
study the statistical distribution of voids using the void probability function
(VPF). Defined by White (1979), the VPF is the probability that a randomly
placed sphere of radius R within a point distribution will not contain any points
(i.e., galaxies; see Figure 14). The VPF is defined such that it depends on the
space density of points; therefore one must be careful when comparing datasets
and simulation results to ensure that the same number density is used. The VPF
traces clustering in the weakly non-linear regime, not in the highly non-linear
regime of galaxy groups and clusters.

Figure 14: A schematic of the void probability function (VPF). The top panel
shows the comoving distribution of galaxies in a small portion of the DEEP2
survey (projected through 10 $h^{-1}$ Mpc), while the lower panel shows a fraction
of the empty spheres identified with a radius of 6 $h^{-1}$ Mpc in the same
volume (from Conroy et al. 2005). Because the figure is projected through one
dimension, it may appear that galaxies reside inside of identified voids; in three
dimensions the voids contain no galaxies.
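
The definition of the VPF given above translates directly into a Monte Carlo
estimate. The Python sketch below is illustrative only: the function names are
hypothetical, the galaxies are assumed to lie in a cubic box with coordinates in
[0, box_size], and sphere centers are kept one radius away from the box faces so
that every trial sphere lies fully inside the sampled volume. Real survey
geometries and selection effects are not handled.

```python
import math
import random


def void_probability(points, radius, box_size, n_trials=2000, seed=0):
    """Fraction of randomly placed spheres of the given radius that are empty."""
    if not 0 < radius < box_size / 2:
        raise ValueError("radius must be positive and smaller than box_size / 2")
    rng = random.Random(seed)
    lo, hi = radius, box_size - radius
    empty = 0
    for _ in range(n_trials):
        center = (rng.uniform(lo, hi), rng.uniform(lo, hi), rng.uniform(lo, hi))
        # The sphere is empty if every point lies farther away than the radius.
        if all(math.dist(center, p) > radius for p in points):
            empty += 1
    return empty / n_trials


def poisson_vpf(number_density, radius):
    """VPF expected for an unclustered (Poisson) point distribution."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return math.exp(-number_density * volume)
```

Comparing the measured value with `poisson_vpf` at the same number density gives
a quick sense of how clustering changes the VPF, and makes explicit why samples
must be matched in number density before their VPFs are compared.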



Benson et al. (2003) predict using ΛCDM simulations that the VPF of galaxies
should be higher than that of dark matter, i.e., that voids as traced by galaxies
are much larger than voids traced by dark matter. This results from the bias of
galaxies compared to dark matter in voids and the fact that in this model the
few dark matter halos that do exist in voids are low mass and therefore often
do not contain bright galaxies. Croton et al. (2004b) measure the VPF in the
2dFGRS dataset and find that it follows hierarchical scaling laws, in that all
higher-order correlation functions can be expressed in terms of the two-point
correlation function. They find that even on scales of ~30 $h^{-1}$ Mpc,
higher-order correlations have an impact, and that the VPF of galaxies is
observed to be different from that of dark matter in simulations.



Conroy et al. (2005) measure the VPF in SDSS galaxies at z ~ 0.1 and
DEEP2 galaxies at z ~ 1 and find that voids traced by redder and/or brighter
galaxy populations are larger than voids traced by bluer and/or fainter galaxies.
They also find that voids are larger in comoving coordinates at z ~ 0.1 than at
z ~ 1; i.e., voids grow over time, as expected. They show that the differences
observed in the VPF as traced by different galaxy populations are entirely
consistent with differences observed in the two-point correlation function and
space density of these galaxy populations. This implies that voids do not appear
to carry additional higher-order information beyond that contained in the
two-point function. They also find excellent agreement with predictions from
ΛCDM simulations that include semi-analytic models of galaxy evolution.



Tinker et al. (2008) interpret the observed VPF in galaxy surveys in terms
of the halo model (see Section 8 above). They compare the observed VPF in
2dFGRS and SDSS to halo model predictions constrained to match the two-
point correlation function and number density of galaxies, using a model in
which the dark matter halo occupation depends on mass only. They find that
with this model they can match the observed data very well, implying that
there is no need for the suppression of galaxy formation in voids; i.e., galaxy
formation does not proceed differently in low-density regions. They find that
the sizes and emptiness of voids show excellent agreement with predictions of
ΛCDM models for galaxies at low redshift to luminosities of L ~ 0.2 L*.

9.3 Filaments 

Galaxy filaments - long strings of galaxies - are the largest systems seen in maps
of large scale structure, and as such provide a key test of theories of structure
formation. Measuring the typical and maximal length of filaments, as well
as their thickness and average density, therefore constrains theoretical models.
Various statistical methods have been proposed to identify and characterize
the morphologies and properties of filaments (e.g., Sousbie et al. (2008) and
references therein).

Figure 15: Filaments identified in the SDSS galaxy distribution (from Sousbie et
al. 2008). Individual filaments are shown in green overlaid on the galaxy density
field shown in purple. The Sloan Great Wall is identified in the foreground, lying
between the red arrows.

In terms of their sizes, the largest length scale at which filaments are
statistically significant, and hence identified as real objects, is 50-80
$h^{-1}$ Mpc, according to an analysis of galaxies in the Las Campanas Redshift
Survey (LCRS; Shectman et al., 1996) by Bharadwaj et al. (2004). They show that
while there appear to be filaments in the survey on longer scales, these arise
from chance alignments and projection effects and are not real structures.
Sousbie et al. (2008) identify and study the length of filaments in SDSS, by
identifying ridges in the galaxy distribution using the Hessian matrix
($\partial^2 \rho / \partial x_i \partial x_j$) and its eigenvalues (see
Fig. 15). They find excellent agreement between observations and ΛCDM numerical
predictions for a flat, low $\Omega_{\rm matter}$ Universe. They argue that
filament measurements are not highly sensitive to observational effects such as
redshift space distortions, edge effects, incompleteness or galaxy bias, which
makes them a robust test of theoretical models.
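
As a rough illustration of how the Hessian of a density field picks out ridges,
the Python sketch below computes the Hessian eigenvalues at one cell of a
two-dimensional gridded field using central finite differences. It is a
simplified toy under stated assumptions, not the method of the papers cited
above: the function names, the restriction to two dimensions, and the threshold
`tol` are all hypothetical choices made for this example.

```python
import math


def hessian_eigenvalues(rho, i, j, h=1.0):
    """Eigenvalues (ascending) of the Hessian of rho at interior cell (i, j).

    rho is a list of rows, indexed as rho[y][x]; h is the grid spacing.
    """
    dxx = (rho[i][j + 1] - 2 * rho[i][j] + rho[i][j - 1]) / h ** 2
    dyy = (rho[i + 1][j] - 2 * rho[i][j] + rho[i - 1][j]) / h ** 2
    dxy = (rho[i + 1][j + 1] - rho[i + 1][j - 1]
           - rho[i - 1][j + 1] + rho[i - 1][j - 1]) / (4 * h ** 2)
    # Closed-form eigenvalues of the symmetric matrix [[dxx, dxy], [dxy, dyy]].
    mean = 0.5 * (dxx + dyy)
    half_gap = math.hypot(0.5 * (dxx - dyy), dxy)
    return mean - half_gap, mean + half_gap


def is_ridge_like(lam_low, lam_high, tol=0.5):
    """True if the field curves strongly downward in one direction only."""
    return lam_low < -tol and abs(lam_high) <= tol


# A straight ridge along x (density falls off away from row 2) and a round peak.
ridge = [[-float((i - 2) ** 2) for j in range(5)] for i in range(5)]
peak = [[-float((i - 2) ** 2 + (j - 2) ** 2) for j in range(5)] for i in range(5)]

print(is_ridge_like(*hessian_eigenvalues(ridge, 2, 2)))
print(is_ridge_like(*hessian_eigenvalues(peak, 2, 2)))
```

On a ridge the field is nearly flat along the filament and falls off steeply
across it, so one eigenvalue is close to zero and the other is strongly
negative. At an isolated peak both eigenvalues are negative, which is what
distinguishes a cluster-like maximum from a filament in this simple picture.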

Bond et al. (2010) use the eigenvectors of the Hessian matrix of the smoothed
galaxy distribution to identify filaments in both SDSS data and ΛCDM simulations
and find that the distribution of filament lengths is roughly exponential,
with many more filaments of length $<10\,h^{-1}$ Mpc than $>20\,h^{-1}$ Mpc. They
find that the filament width distribution agrees between the SDSS data and
N-body simulations. The mean filament width depends on the smoothing length;
for smoothing scales of 10 and $h^{-1}$ Mpc, the mean filament widths are 5.5 and
8.4 Mpc. In ΛCDM simulations they find that the filamentary structure
in the dark matter density distribution is in place by z = 3, tracing a similar
pattern of density ridges. This is in contrast to what is found for voids, which
become much more prominent and low-density at later cosmic epochs.
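
As a rough illustration of what a "roughly exponential" length distribution
implies, the sketch below fits a single scale length to a set of filament
lengths and compares the expected fractions of short and long filaments. The
lengths are synthetic draws, and the scale `TOY_SCALE` is an arbitrary
assumption rather than a measured value; the function names are hypothetical.
The estimator also assumes lengths are complete down to zero, which a real
catalogue with a minimum detectable length would not satisfy.

```python
import math
import random


def fit_exponential_scale(lengths):
    """Maximum-likelihood scale L0 for p(L) proportional to exp(-L / L0),
    L >= 0. For an untruncated exponential this is the sample mean."""
    if not lengths:
        raise ValueError("need at least one filament length")
    return sum(lengths) / len(lengths)


def fraction_shorter(scale, limit):
    """Expected fraction of filaments with length below `limit`."""
    return 1.0 - math.exp(-limit / scale)


def fraction_longer(scale, limit):
    """Expected fraction of filaments with length above `limit`."""
    return math.exp(-limit / scale)


# Synthetic stand-in for a filament catalogue (NOT survey data).
rng = random.Random(0)
TOY_SCALE = 8.0  # arbitrary illustrative scale, in h^-1 Mpc
toy_lengths = [rng.expovariate(1.0 / TOY_SCALE) for _ in range(20000)]

fitted_scale = fit_exponential_scale(toy_lengths)
short_fraction = fraction_shorter(fitted_scale, 10.0)  # L < 10 h^-1 Mpc
long_fraction = fraction_longer(fitted_scale, 20.0)    # L > 20 h^-1 Mpc
```

For any exponential distribution with a scale length of this order, the
fraction of filaments below 10 $h^{-1}$ Mpc greatly exceeds the fraction above
20 $h^{-1}$ Mpc, which is the qualitative behavior described above.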



Choi et al. (2010) use the methods of Bond et al. (2010) to study the evolution
of filamentary structure from z ~ 0.8 to z ~ 0.1 using galaxies from the
DEEP2 survey and the SDSS. They find that neither the space density of filaments
nor the distribution of filament lengths has changed significantly over the
last seven Gyr of cosmic time, in agreement with ΛCDM numerical predictions.
The distribution of filament widths has changed, however, in that the
distribution is broader at lower redshift and has a smaller typical width. This
observed evolution in the filament width distribution naturally results from
non-linear growth of structure and is consistent with the results on voids
discussed above, in that over time voids grow larger while filaments become
tighter (i.e. have a smaller typical width) though not necessarily longer.



10 Summary and Future 

This overview of our current understanding of the large-scale structure of the 
Universe has shown that quantitative measurements of the clustering and spa- 
tial distribution of galaxies have wide applications and implications. The non- 
uniform structure reveals properties of both the galaxies and the dark matter 
halos that comprise this large-scale structure. Statistics such as the two-point 
correlation function can be used not only to constrain cosmological parameters 
but also to understand galaxy formation and evolution processes. The advent 
of extremely large redshift surveys with samples of hundreds of thousands of 
galaxies has led to very precise measurements of the clustering of galaxies at 
z ~ 0.1 as a function of various galaxy properties such as luminosity, color, and
stellar mass, influencing our understanding of how galaxies form and evolve. 
Initial studies at higher redshift have revealed that many of the general cor- 
relations that are observed between galaxy properties and clustering at z ~ 
were in place when the Universe was a fraction of its current age. As larger 
redshift surveys are carried out at higher redshifts, much more can be learned 
about how galaxy populations change with time. Theoretical interpretations of 
galaxy clustering measurements such as the halo occupation distribution model 
have also recently made great strides in terms of statistically linking various 
properties of galaxies with those of their host dark matter halos. Such studies 
reveal not only how light traces mass on large scales but how baryonic mass and 
dark matter co-evolve with cosmic time. 

There are many exciting future directions for studies of galaxy clustering
and large-scale structure. Precise cosmological constraints can be obtained using
baryon acoustic oscillation signatures observed in clustering measurements from
wide-area surveys (Eisenstein et al. 2005). Specific galaxy populations can be
understood in greater detail by comparing their clustering properties with those
of galaxies in general. For example, the clustering of different types of active
galactic nuclei (AGN) can be used to constrain the AGN fueling mechanisms,
lifetimes, and host galaxy populations (Coil et al. 2009). As discussed above,
measurements of galaxy clustering have the power to place strong constraints
on contemporary models of galaxy formation and evolution and advance our
understanding of how galaxies populate and evolve within dark matter halos.

The author thanks James Aird, Mirko Krumpe, and Stephen Smith for pro- 
viding comments on earlier drafts of the text. 